Systems and methods for compressed sensing measurement of long-range correlated noise

ABSTRACT

A method for detecting a two-qubit correlated dephasing error includes accessing a signal of a quantum system, where the quantum system includes a plurality of qubits. Every qubit has a nonzero rate of dephasing and some qubits have a nonzero rate of correlated dephasing. The signal further includes information about a matrix that includes diagonal elements and off-diagonal elements. The off-diagonal elements of the matrix are 2 s-sparse. The method further includes performing randomized measurements of the off-diagonal elements of the matrix and recovering the matrix based on a direct measurement of the diagonal elements of the matrix.

CROSS-REFERENCE TO RELATED APPLICATION/CLAIM OF PRIORITY

This application claims the benefit of, and priority to, U.S.Provisional Patent Application No. 63/188,373, filed on May 13, 2021,the entire contents of which are hereby incorporated herein byreference.

GOVERNMENT SUPPORT

This invention was made with government support under W911NF1510397awarded by the Army Research Office (ARO) and under FA95501810161awarded by the Air Force Office of Scientific Research (AFOSR). Thegovernment has certain rights in the invention.

TECHNICAL FIELD

The present disclosure relates generally to the field of noisyintermediate-scale quantum information (NISQ) processors. Morespecifically, an aspect of the present disclosure provides detectingtwo-qubit correlated dephasing errors in NISQ devices.

BACKGROUND

The development of noisy intermediate-scale quantum informationprocessors (NISQ devices) has the potential to advance many areas ofcomputational science. An important problem is the characterization ofnoise processes in these devices, in order to improve their performance(via calibration and error correction), and to ensure correctinterpretation of the results.

Accordingly, there is interest in improving measurements that have themathematical properties needed for compressed sensing, and can beimplemented efficiently on a quantum device.

SUMMARY

An aspect of the present disclosure provides computer-implemented methodfor detecting a two-qubit correlated dephasing error. The methodincludes accessing a signal of a quantum system, where the quantumsystem includes a plurality of qubits. Every qubit has a nonzero rate ofdephasing and some qubits have a nonzero rate of correlated dephasing.The signal includes information about a matrix that includes diagonalelements and off-diagonal elements. The off-diagonal elements of thematrix are 2 s-sparse. The method further includes performing randomizedmeasurements of the off-diagonal elements of the matrix and recoveringthe matrix based on a direct measurement of the diagonal elements of thematrix.

In accordance with aspects of the disclosure, the method may furtherinclude estimating a two-qubit correlated dephasing error based on therecovered matrix.

In an aspect of the present disclosure, performing the randomizedmeasurements may include preparing entangled Greenberger-Horne-Zeilingerstates (GHZ states) on random subsets of the plurality of qubits.

In another aspect of the present disclosure, the randomized measurementsmay be based on noise spectroscopy and quantum sensing.

In yet another aspect of the present disclosure, the recovered matrixmay include a restricted isometry property.

In accordance with further aspects of the present disclosure, the methodmay further include estimating a vector of decay rates that aredependent on the matrix.

In an aspect of the present disclosure, the recovered matrix may befurther based on the estimated decay rate.

In another aspect of the present disclosure, the method may furtherinclude estimating a relaxation time of the quantum system based on theestimated decay rate.

In yet another aspect of the present disclosure, the method may furtherinclude estimating a decoherence time of the quantum system based on theestimated decay rate.

In yet another aspect of the present disclosure, the method may furtherinclude detecting a long-range correlated dephasing error based on therecovered matrix.

An aspect of the present disclosure provides a computer-implementedmethod for detecting a two-qubit correlated dephasing error, the methodincluding: accessing a signal of a quantum system, the signal includinga plurality of qubits; dephasing entangled states of the plurality ofqubits based on performing Ramsey spectroscopy using entangled states ofrandom subsets of qubits of the plurality of qubits; measuring a linearfunction of a correlation matrix, where the correlation matrixcorresponds to correlated Markovian dephasing between pairs of qubits ofthe plurality of qubits; generating a first vector and a second vector;and estimating a decay rate based on the first vector and the secondvector. A plurality of elements of the first vector and of the secondvector are randomly chosen. Every qubit of the plurality of qubits has anonzero rate of dephasing and some qubits have a nonzero rate ofcorrelated dephasing.

In accordance with further aspects of the present disclosure, the methodmay further include generating a restricted isometry property-basedrecovery matrix. The recovery matrix may be further based on theestimated decay rate.

In another aspect of the present disclosure, the method may furtherinclude detecting a long-range correlated dephasing error based on therecovery matrix.

In yet another aspect of the present disclosure, the method may furtherinclude estimating a relaxation time of the quantum system based on theestimated decay rate.

In a further aspect of the present disclosure, the method may furtherinclude estimating a decoherence time of the quantum system based on theestimated decay rate.

An aspect of the present disclosure provides a system for detecting atwo-qubit correlated dephasing error, where the system includes: aprocessor; and a memory, including instructions stored thereon, whichwhen executed by the processor, cause the system to: access a signal ofa quantum system, the signal includes a plurality of qubits; andgenerate a matrix C in R^(n×n), wherein its off-diagonal part is 2 ssparse. The matrix C is based on the accessed signal. Every qubit has anonzero rate of dephasing and some qubits have a nonzero rate ofcorrelated dephasing.

In yet a further aspect of the present disclosure, the instructions,when executed by the processor, may further cause the system to performrandomized measurements based on: estimating (b−a)^(T)C(b−a) for any a≠bin {0,1}n; choosing a sequence of random vectors r(1), . . . , r(m)in{1,0,−1}n; and estimating ΦC=(r^((1){circumflex over ( )}T)Cr⁽¹⁾), . . ., (r^((m){circumflex over ( )}T)Cr^((m))), where Φ:R^(n×n)→R^(m).

In yet a further aspect of the present disclosure, the instructions,when executed by the processor, may further cause the system to estimatea decay rate Γ_(ab) based on any a≠b in {0,1}^(n), and estimate adecoherence time of the quantum system based on the decay rate Γ_(ab).

In an aspect of the present disclosure, the instructions, when executedby the processor, may further cause the system to recover matrix C basedon measuring a plurality of diagonal elements of matrix C directly anddetermining an off-diagonal element of Matrix C based on using

₁-regularized least-squares regression.

In an aspect of the present disclosure, the instructions, when executedby the processor, may further cause the system to detect a long-rangecorrelated dephasing error based on the recovered matrix C.

Further details and aspects of exemplary embodiments of the presentdisclosure are described in more detail below with reference to theappended figures.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the features and advantages of the presentdisclosure will be obtained by reference to the following detaileddescription that sets forth illustrative embodiments, in which theprinciples of the present disclosure are utilized, and the accompanyingdrawings of which:

FIG. 1 is a diagram of an exemplary noisy intermediate-scale quantuminformation (NISQ) processor, in accordance with examples of the presentdisclosure;

FIG. 2 is a high level flow diagram of the computer-implemented methodfor detecting a two-qubit correlated dephasing error, in accordance withexamples of the present disclosure;

FIG. 3 is a correlation graph of a noise model of the quantum system ofFIG. 1 , in accordance with examples of the present disclosure;

FIG. 4 illustrates a flow diagram of the computer-implemented method 600for determining a two-qubit correlated dephasing error for the quantumsystem 100, such as the quantum system of FIG. 1 , in accordance withexamples of the present disclosure;

FIG. 5 is a graph that illustrates single qubit Ramsey spectroscopy, inaccordance with examples of the present disclosure;

FIG. 6 is a correlation matrix C corresponding to the correlation graphof FIG. 3 , in accordance with examples of the present disclosure;

FIG. 7A is a table that illustrates sample complexity of differentmethods for reconstructing the correlation matrix C, in accordance withexamples of the present disclosure;

FIG. 7B is a table that illustrates sample complexity of differentmethods for reconstructing the correlation matrix C, in accordance withexamples of the present disclosure;

FIGS. 8A-8C are graphs that illustrate scaling of the reconstructionerror under various circumstances, in accordance with examples of thepresent disclosure;

FIG. 9A is a graph that illustrates trajectories of a random walk usedin estimating an evolution time, in accordance with examples of thepresent disclosure;

FIG. 9B is a graph that illustrates the accuracy of a final estimate ofthe evolution time, in accordance with examples of the presentdisclosure;

FIGS. 10A-10C are graphs that illustrate an exponential decay of {tildeover (P)}_(ab)(t) when there are state-preparation-and-measurement(SPAM) errors for a 3-qubit Greenberger-Horne-Zeilinger (GHZ) state, inaccordance with examples of the present disclosure; and

FIG. 11 illustrates a flow diagram of the computer-implemented methodfor determining a two-qubit correlated dephasing error for the quantumsystem, such as the quantum system of FIG. 1 , in accordance withexamples of the present disclosure.

DETAILED DESCRIPTION

The present disclosure relates generally to the field of noisyintermediate-scale quantum information (NISQ) processors. Morespecifically, an aspect of the present disclosure provides detectingtwo-qubit correlated dephasing errors in NISQ devices.

Embodiments of the present disclosure are described in detail withreference to the drawings wherein like reference numerals identifysimilar or identical elements.

Although the present disclosure will be described in terms of specificexamples, it will be readily apparent to those skilled in this art thatvarious modifications, rearrangements, and substitutions may be madewithout departing from the spirit of the present disclosure. The scopeof the present disclosure is defined by the claims appended hereto.

For purposes of promoting an understanding of the principles of thepresent disclosure, reference will now be made to exemplary embodimentsillustrated in the drawings, and specific language will be used todescribe the same. It will nevertheless be understood that no limitationof the scope of the present disclosure is thereby intended. Anyalterations and further modifications of the novel features illustratedherein, and any additional applications of the principles of the presentdisclosure as illustrated herein, which would occur to one skilled inthe relevant art and having possession of this disclosure, are to beconsidered within the scope of the present disclosure.

The notation x∝y means that x and y are proportional, i.e., x=cy where cis a “universal” constant. A universal constant is a quantity whosevalue is fixed once and for all, and does not depend on any othervariable. Polylogarithmic factors are written in a compact way asfollows: log^(c)n=(log n)^(c). Big-O notation, such as O(n³), denotes anasymptotic upper bound. Matrices are denoted by capital letters. For avector v, ∥v∥_(p) denotes the

_(p) norm (in cases where p=1,2, ∞). For a matrix M,∥M∥_(F)=(Σ_(jk)|M_(jk)|²)^(1/2) denotes the Frobenius norm (i.e., theSchatten 2-norm, or the

₂ norm of the vector containing the entries of M), ∥M∥ denotes theoperator norm (i.e., the Schatten ∞-norm), ∥M∥_(tr) denotes the tracenorm (i.e., the Schatten 1-norm, or the nuclear norm), and ∥M

=Σ_(jk)|M_(jk)| denotes the

norm of the vector containing the entries of M. Given an n×n matrix M,let uvec(M)=(M_(ij))_(i<j) denote the vector (of dimension n(n−1)/2)containing the entries in the upper-triangular part of M, excluding thediagonal. Statistical estimators are written with a hat superscript,e.g., {circumflex over (Γ)} is a random variable that represents anestimator for some unknown quantity Γ. For a random variable {circumflexover (Γ)}, ∥{circumflex over (Γ)}∥_(ψ) ₂ denotes the sub-Gaussian norm,and ∥{circumflex over (Γ)}∥_(ψ) ₁ denotes the subexponential norm.

The development of noisy intermediate-scale quantum informationprocessors (NISQ devices) has the potential to advance many areas ofcomputational science. An important problem is the characterization ofnoise processes in these devices, in order to improve their performance(via calibration and error correction), and to ensure correctinterpretation of the results. One challenge is to characterize all ofthe noise processes that are likely to occur in practice, using someexperimental procedure that is efficient and can scale up to largenumbers of qubits.

Compressed sensing offers one approach to solving this problem, whereone uses specially-designed measurements (and classical post-processing)to learn an unknown signal that has some prescribed structure. Forexample, the unknown signal can be a sparse vector or a low-rank matrix,the measurements can consist of random projections sampled from variousdistributions, and the classical post-processing can consist of solvinga convex optimization problem (e.g., minimizing the

₁ or trace norm), using efficient algorithms. This approach has beenused in several previous works on quantum state and process tomography,and estimation of Hamiltonians and Lindbladians.

From a theoretical perspective, one of the main challenges is to designmeasurements that have the mathematical properties needed for compressedsensing, and can be implemented efficiently on a quantum device. Therehas been a substantial amount of work in this area, which can be broadlyclassified into two approaches: “sparsity-based” and “low-rank”compressed sensing. For compressed sensing of “low-rank” objects (e.g.,low-rank density matrices and quantum processes), there seems to be afew natural choices for measurements, including random Paulimeasurements, and fidelities with random Clifford operations. Forcompressed sensing of “sparse” objects (e.g., sparse Hamiltonians,Lindbladians, or Pauli channels), however, the situation is morecomplicated, as a larger number of different measurement schemes andclassical post-processing methods have been proposed, and the optimaltype of measurement seems to depend on the situation at hand. Thiscomplicated state of affairs can be explained in part because “sparsity”occurs in a wider variety of situations than “low-rankness.”

Long-range correlated errors can severely impact the performance of NISQ(noisy intermediate-scale quantum) devices, and fault-tolerant quantumcomputation. Characterizing these errors is important for improving theperformance of these devices, via calibration and error correction, andto ensure correct interpretation of the results. A compressed sensingmethod for detecting two-qubit correlated dephasing errors may be usedfor characterizing these errors, assuming only that the correlations aresparse (i.e., at most s pairs of qubits have correlated errors, wheres<<n(n−1)/2, and n is the total number of qubits). In particular, thedisclosed method can detect long-range correlations between any twoqubits in the quantum system (i.e., the correlations are not restrictedto be geometrically local).

The disclosed method is highly scalable, because the disclosed methodrequires as few as m∝s log n measurement settings (where ∝ denotesproportionality), and efficient classical post-processing based onconvex optimization. In addition, when m∝s log⁴n, the disclosed methodis highly robust to noise, and has sample complexity O(max(n, s)²log⁴(n)), which can be compared to conventional methods that have samplecomplexity O(n³). Thus, the disclosed method is advantageous when thecorrelations are sufficiently sparse, that is, when s≤O(n^(3/2)/log²n).The disclosed method also performs well in numerical simulations onsmall quantum system sizes, and has some resistance tostate-preparation-and-measurement (SPAM) errors. A feature of thedisclosed method is a new type of compressed sensing measurement, whichworks by preparing entangled Greenberger-Horne-Zeilinger states (GHZstates) on random subsets of qubits, and measuring their decay rateswith high precision.

Characterizing errors when building large-scale quantum devices isneeded for calibrating, correcting the errors, and interpreting theresult of the quantum devices correctly. For example, quantumsimulations have only a limited simulation time. Metrology and quantumsensing may lead to reduced sensitivity, while error correction andfault tolerance require increased overhead.

Thus, there is a need to find methods for characterizing all the likelynoise processes in quantum systems that are efficient and scalable.

Referring to FIG. 1 , a diagram of an example quantum system 100, isshown. The quantum system 100 may include, for example, a NISQ (noisyintermediate-scale quantum) device (i.e., a NISQ processor). The term“noisy” refers to the fact that quantum processors are very sensitive tothe environment and may lose their quantum state due to quantumdecoherence (i.e., dephasing). The quantum system 100 includes qubits102 which are linked by a coupler 104. The qubits are the quantumversion of classic binary bits realized with a two-statequantum-mechanical system. The coupler 104, for example, may beimplemented by connecting the two qubits to an intermediate electricalcoupling circuit. The circuit might be a fixed element, such as acapacitor, or a controllable element. It is contemplated that thequantum system 100 may include any suitable number of qubits 102 and/orcouplers 104.

The theory of “sparsity-based” quantum compressed sensing may be appliedto a physical problem that is relevant to the development of quantumsystem 100 (e.g., a NISQ device), such as detecting long-rangecorrelated dephasing errors. A simple model of correlated dephasing, isdescribed by the following Markovian master equation:

$\frac{d\rho}{dt} = {{\mathcal{L}(\rho)} = {\sum_{j,{k = 1}}^{n}{{c_{jk}( {{Z_{k}\rho Z_{j}} - {\frac{1}{2}\{ {{Z_{j}Z_{k}},\rho} \}}} )}.}}}$

Here the quantum system 100 consists of n qubits, and Z_(j) and Z_(k)are Pauli σ_(z) operators that act on the j'th and k'th qubits,respectively. The noise is then completely described by the correlationmatrix C=(c_(jk))∈

^(n×n). (See FIGS. 3 and 6 .) Generalizations of this model with complexcoefficients c_(jk), and an additional environment-induced Hamiltonianare considered.

Here, the diagonal elements c_(jj) show the rates at which single qubitsdephase, and the off-diagonal elements c_(jk) show the rates at whichpairs of qubits undergo correlated dephasing. Typically, the diagonal ofC will be nonzero, while the off-diagonal part may be dense or sparse,depending on the degree of connectivity between the qubits and theirenvironment.

This master equation describes a number of physically plausiblescenarios, such as spin-½ particles coupling to a shared bosonic bath.It has also been studied as an example of how collective decoherence canaffect physical implementations of quantum computation and quantumsensing.

This model of correlated dephasing is quite different from other modelsof crosstalk that are based on quantum circuits or Pauli channels. Thedisclosed model describes crosstalk that arises from the physicalcoupling of the qubits to their shared environment. The disclosed modelhas a different character from crosstalk that arises from performingimperfect two-qubit gates, or correlated errors that result when thephysical noise processes are symmetrized by applying quantum operationssuch as Pauli twirls.

The disclosed model of correlated dephasing can be learned efficientlywhen the off-diagonal part of the correlation matrix C is sparse. It isassumed that C has at most s<<n(n−1)/2 nonzero entries above thediagonal, but these entries may be distributed arbitrarily; inparticular, long-range correlated errors are allowed. (Note thatphysical constraints imply that C is positive semidefinite, henceC=C^(†))

The fact that the disclosed model enables long-range correlations isquite powerful, and is relevant to a number of interesting settings inquantum information science. These settings include experimental NISQdevices, which are often engineered to have long-range interactions, inorder to perform quantum computations more efficiently; the execution ofquantum circuits on distributed quantum computers, where long-rangecorrelations are generated when qubits are moved or teleported from onelocation to another; and quantum sensor arrays, where a detection eventat an arbitrary location (j,k) in the array may be registered as apairwise correlation between a qubit that is coupled to row j and aqubit that is coupled to column k in the array.

Referring to FIG. 2 a high level flow diagram of a computer-implementedmethod 200 for detecting a two-qubit correlated dephasing error isshown. A quantum system 100 such as a NISQ device is observed undernoisy evolution 606 (FIG. 4 ). Random measurements 210 (e.g., Ramseyspectroscopy, FIG. 5 ) are made and a sparse correlation matrix C 400(FIG. 6) is characterized based on the random measurements 210. Thecorrelation matrix C 400 corresponds to correlated Markovian dephasingbetween pairs of qubits.

Referring to FIG. 3 an illustration of a correlation graph of a noisemodel 300 of the quantum system 100 of FIG. 1 is shown. The qubits 302experience correlated Markovian dephasing. Non-zero c_(ij) are shown,indicating correlated noise affecting the pairs of qubits connected bythe lines 304.

FIG. 4 illustrates a computer-implemented method 600 for determining atwo-qubit correlated dephasing error for the quantum system 100, such asthe quantum system of FIG. 1 . Initially, a general form of Ramseyspectroscopy using entangled states on multiple qubits is described. Asimple method for directly measuring the correlation matrix C 400, byperforming spectroscopy on every pair of qubits is described. Thissimple method serves as a baseline for measuring the performance of thedisclosed compressed sensing method. It is contemplated that thedisclosed methods may be performed by a processor and memory. The memorymay include instructions thereon, which are executed by the processor.

Initially, a signal is accessed from a quantum system 100. The quantumsystem 100 contains a plurality of qubits, every qubit has a nonzerorate of dephasing, and some pairs of qubits have a nonzero rate ofcorrelated dephasing. At step 602, vectors a and b whose elements arerandomly chosen from {0,1} are generated. At step 604, the operationU_(ab) prepares the state

$  { {\frac{1}{\sqrt{2}}( {❘a} } \rangle + {❘b}} \rangle ).$

At step bub the quantum system 100 evolves under dephasing noise fortime t, represented by Et. At step 608, U_(ab) ^(†) is applied and acomputational basis measurement is performed. The probability ofobtaining the outcome 0, P_(ab), decays exponentially (towards ½) as tincreases. At step 610, matrix C 400 (FIG. 6 ) may be recovered bymeasuring the decay rate Γ_(ab) for various a's and b's.

Here it is assumed that the entries in the matrix C 400 are real (i.e.,with imaginary part equal to zero). This holds true in a number ofcases, for instance, when the qubits are coupled to a bath at a hightemperature. When C is complex, it can be handled using a generalizationof the disclosed method. Note that physical constraints imply that C ispositive semidefinite; hence C=C^(T) in the real case, and C=C^(†) inthe complex case.

In aspects, the entangled states may undergo dephasing. The followingprocedure enables the measurement of certain linear functions of thecorrelation matrix C 400. This procedure is very general, and includessingle- and two-qubit Ramsey spectroscopy as special cases. Consider ann-qubit state of the form

$\begin{matrix}{{  { { {❘\psi_{ab}} \rangle = {\frac{1}{\sqrt{2}}( {❘a} }} \rangle + {❘b}} \rangle ) \in ( {\mathbb{C}}^{2} )^{\otimes n}},} & ( {{Eq}.2.1} )\end{matrix}$

where a, b∈{0,1}^(n), a≠b, |a

=|a₁, a₂, . . . , a_(n)

and |b

=|b₁, b₂, . . . , b_(n)

. By choosing a and b appropriately, one can make |ψ_(ab)

be a single-qubit |+

state, a two-qubit Bell state, or a many-qubitGreenberger-Horne-Zeilinger state (GHZ state) (while the other qubitsare in a tensor product of |0

and |1

states).

For example, the state |ψ_(ab)

may be prepared, the state may be allowed to evolve for time t under theLindbladian (Eq. 1.1). ρ(t) is the resulting density matrix. As can beseen in Eq. (1.2), the coherences (that is, the off-diagonal elements |a

b|) of ρ(t) decay as exp(−Γ_(ab)t), where the decay rate Γ_(ab) ∈

is defined so that

(|a

b|)=−Γ_(ab)|a

b|. This decay rate can be estimated by allowing the quantum system 100to evolve for a suitable amount of time t, and then measuring in thebasis

$  { {\frac{1}{\sqrt{2}}( {❘a} } \rangle + {❘b}} \rangle ).$

The decay rate Γ_(ab) shows a certain linear function of the correlationmatrix C, which can be written explicitly as follows. Let α_(i)=(−1)^(a)^(i) and β_(i)=(−1)^(b) ^(i) denote the expectation values of Z_(i)corresponding to the states |a

and |b

, respectively. In addition, define the vectors α=(α₁, . . . , α_(n))and β=(β₁, . . . , β_(n)).

$\begin{matrix}{\Gamma_{ab} = {- {\sum_{ij}{c_{ij}\lbrack {{\alpha_{i}\beta_{j}} - {\frac{1}{2}\alpha_{i}\alpha_{j}} - {\frac{1}{2}\beta_{i}\beta_{j}}} \rbrack}}}} & ( {{Eq}.2.2} )\end{matrix}$ $\begin{matrix}{{= {2r^{T}{Cr}}},} & ( {{Eq}.2.3} )\end{matrix}$

Recall the definition of r, r=b−a, (Eq. 2.4) note that

${r = \frac{\alpha - \beta}{2}},$

and the fact that C=C^(T) is used to symmetrize the equation.

Single-qubit Ramsey spectroscopy (FIG. 4 ) is a special case of thisprocedure, where a=(0, 0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0)(where the 1 appears in the j'th position). Then |ψ_(ab)

is a |+

state on the j'th qubit, and Γ_(ab)=2c_(jj) tells the rate of dephasingon the j'th qubit.

Two-qubit generalized spectroscopy is another special case, where a=(0,0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0,1,0, . . . , 0) (wherethe l's appear in the j'th and k'th positions). Then |ψ_(ab)

is a maximally-entangled state on qubits j and k, andΓ_(ab)=2(c_(jj)+c_(jk)+c_(kj)+c_(kk)) provides information about therate of correlated dephasing on qubits j and k.

An advantage of the disclosed methods is that the methods enable thecharacterization of errors. The characterized errors may be used toimprove quantum systems (and devices) to reduce errors. For example,errors may be mitigated by applying a gate opposite to the characterizederror. In another example, there are many quantum systems 100 thatinclude a master clock, that is configured to track the phase of all ofthe qubits, so if dephasing occurs, the system loses the ability totrack the phase of the qubits. The disclosed methods enable mitigationof the dephasing error based on the characterized error.

FIG. 5 is a graph that illustrates single qubit Ramsey spectroscopy.Ramsey spectroscopy is a form of particle interferometry that uses thephenomenon of magnetic resonance to measure transition frequencies ofparticles. The plot 502 shows the overlap P₊, which decays exponentially(towards ½) with decay rate Γ, as a function of time t. The inset blockdiagram 504 illustrates the Ramsey protocol, where a superposition ofqubit states is prepared with the first Hadamard gate H, the quantumsystem 100 undergoes dephasing (represented by the noise channel ε_(t))for time t, and the second H followed by a measurement in thecomputational basis measures the overlap P₊.

In aspects, the decay rate Γ_(ab) may be estimated by:

1. Choosing some evolution time t≥0 such that

$\frac{1}{2} \leq {\Gamma_{ab}t} \leq 2.$

This can be done in various ways, for instance, by starting with aninitial guess t=τ₀ and performing binary search, using ˜|log(Γ_(ab)τ₀)|trials of the experiment.

2. Repeating the following experiment N_(trials) times: (N_(trials) maybe set using equation (2.11) below)

(a) Preparing the state

$  { { {❘\psi_{ab}} \rangle = {\frac{1}{\sqrt{2}}( {❘a} }} \rangle + {❘b}} \rangle ),$

allow the state to evolve for time t, then measure in the basis

$  { {\frac{1}{\sqrt{2}}( {❘a} } \rangle + {❘b}} \rangle ).$

Letting N₊ and N⁻ be the number of

$  { {  { {\frac{1}{\sqrt{2}}( {❘a} } \rangle + {❘b}} \rangle ){and}\frac{1}{\sqrt{2}}( {❘a} } \rangle + {❘b}} \rangle )$

outcomes, respectively. Note that the probabilities of these outcomesare given by

$P_{+} = {{\frac{1}{2}( {1 + e^{{- \Gamma_{ab}}t}} ){and}P_{-}} = {\frac{1}{2}{( {1 - e^{{- \Gamma_{ab}}t}} ).}}}$

3. Defining:

$\begin{matrix}{\Delta = {\frac{N_{+} - N_{-}}{N_{t{rials}}}.}} & ( {{Eq}.2.5} \end{matrix}$

Note that Δ is an unbiased estimator for P₊−P⁻, that is,

(Δ)=P₊−P⁻=e^(−Γ) ^(ab) ^(t). This motivates the definition of anestimator for Γ_(ab):

$\begin{matrix}{{\overset{\hat{}}{\Gamma}}_{ab} = {{- \frac{1}{t}}{{\ln(\Delta)}.}}} & ( {{Eq}.2.6} )\end{matrix}$

Next, some bounds on the accuracy of the estimator {circumflex over(Γ)}_(ab) may be set. To do this, the notion of a sub-Gaussian randomvariable is introduced (e.g., a random variable whose moments and tailprobabilities behave like those of a Gaussian distribution). Areal-valued random variable X is sub-Gaussian if there exists a realnumber K₂ such that, for all p≥1, (

(|X|^(p)))^(1/p)≤K₂√{square root over (p)}. (Eq. 2.7)

The sub-Gaussian norm of X, denoted ∥X∥_(ψ) ₂ , is defined to be thesmallest choice of K₂ in (2.7), i.e.,

$\begin{matrix}{{X}_{\psi_{2}} = {\sup\limits_{p \geq 1}{{p^{{- 1}/2}( {{\mathbb{E}}( {❘X❘}^{p} )} )}^{1/p}.}}} & ( {{Eq}.2.8} )\end{matrix}$

In addition, it is known that X is sub-Gaussian if and only if thereexists a real number K₁ such that, for all t≥0, Pr[|X|>t]≤exp(1−t²/K₁²). (Eq. 2.9)

The smallest choice of K₁ in (Eq. 2.9) is equivalent to the sub-Gaussiannorm∥X∥_(ψ) ₂ , in the following sense: there is a universal constant csuch that, for all sub-Gaussian random variables X, the smallest choiceof K₁ in (Eq. 2.9) differs from ∥X∥_(ψ) ₂ by at most a factor of c.

{circumflex over (Γ)}_(ab)−Γ_(ab) is a sub-Gaussian random variable,whose sub-Gaussian norm is bounded by

$\begin{matrix}{{{{{\overset{\hat{}}{\Gamma}}_{ab} - \Gamma_{ab}}}_{\psi_{2}} \leq {C_{0}\frac{\Gamma_{ab}}{\sqrt{N_{trials}}}}},} & ( {{Eq}.2.1} )\end{matrix}$

where C₀ is a universal constant (e.g., 1/√{square root over(N_(trials))}) In particular, the accuracy of {circumflex over (Γ)}_(ab)can be controlled by setting N trials appropriately: for any δ>0 and∈>0, where

$\begin{matrix}{{N_{trials} \geq {\frac{2}{\delta^{2}}{\ln( \frac{2}{\epsilon} )}}},} & ( {{Eq}.2.11} )\end{matrix}$

then {circumflex over (Γ)}_(ab) satisfies the following error bound:with probability at least 1−∈, |{circumflex over (Γ)}_(ab)−Γ_(ab)|≤2δe²Γ_(ab). (Eq. 2.12)

This shows that the error in {circumflex over (Γ)}_(ab) is at most asmall fraction of the true value of Γ_(ab), independent of the magnitudeof Γ_(ab).

In aspects, this may derive an error bound that involves {circumflexover (Γ)}_(ab) rather than Γ_(ab), and may be computed from the observedvalue of {circumflex over (Γ)}_(ab). To show such an error bound, thetriangle inequality may be used to write |{circumflex over(Γ)}_(ab)−Γ_(ab)|≤2δe²(|{circumflex over (Γ)}_(ab)|+{circumflex over(Γ)}_(ab)−Γ_(ab)|), and divide by (1−2δe²) to obtain:

$\begin{matrix}{{❘{{\overset{\hat{}}{\Gamma}}_{ab} - \Gamma_{ab}}❘} \leq {\frac{2\delta e^{2}}{1 - {2\delta e^{2}}}{{❘{\overset{\hat{}}{\Gamma}}_{ab}❘}.}}} & ( {{Eq}.2.13} )\end{matrix}$

It remains to prove (Eq. 2.10) and (Eq. 2.12). First use Hoeffding'sinequality to show that Δ is close to its expectation value:Pr[|Δ−(P₊−P⁻)|≥δ]≤2 exp(−N_(trials)δ²/2). (Eq. 2.14)

When |Δ−(P₊−P⁻)|≤δ, the error in {circumflex over (Γ)}_(ab) may bebounded as follows: (using the fact that P₊−P⁻=e^(−Γ) ^(ab) ^(t)≥e⁻² and

$ {t \geq \frac{1}{2\Gamma_{ab}}} )$

$\begin{matrix}\begin{matrix}{{- {\overset{\hat{}}{\Gamma}}_{ab}} \leq {\frac{1}{t}{\ln( {P_{+} - P_{-} + \delta} )}}} \\{\leq {\frac{1}{t}\lbrack {{\ln( {P_{+} - P_{-}} )} + {\delta e^{2}}} \rbrack}} \\{{\leq {{- \Gamma_{ab}} + {2\Gamma_{ab}\delta e^{2}}}},}\end{matrix} & ( {{Eq}.2.15} )\end{matrix}$ $\begin{matrix}\begin{matrix}{{- {\overset{\hat{}}{\Gamma}}_{ab}} \geq {\frac{1}{t}{\ln( {P_{+} - P_{-} - \delta} )}}} \\{\geq {\frac{1}{t}\lbrack {{\ln( {P_{+} - P_{-}} )} - {\delta e^{2}}} \rbrack}} \\{\geq {{- \Gamma_{ab}} - {2\Gamma_{ab}\delta{e^{2}.}}}}\end{matrix} & ( {{Eq}.2.16} )\end{matrix}$

Thus:

Pr[|{circumflex over (Γ)}_(ab)−Γ_(ab)|≥2δe ²Γ_(ab)]≤2 exp(−N_(trials)δ²/2).  (Eq. 2.17)

(Eq. 2.17) implies (Eq. 2.10) and (Eq. 2.12).

Referring to FIG. 6 , matrix C 400 corresponding to the correlationgraph in FIG. 3 is shown. The diagonal elements correspond to singlequbit dephasing 402. The off-diagonal elements indicate correlateddephasing noise 404. In aspects, a direct estimation of the correlationmatrix C 400 may be performed. The correlation matrix C 400 may beestimated directly, by performing single-qubit spectroscopy to measurethe diagonal elements c_(jj), and by performing two-qubit spectroscopyto measure the off-diagonal elements c_(jk). This method may be used asa baseline to measure the performance of the disclosed compressedsensing method.

For simplicity, the case where C is real is considered. Since C ispositive definite, this implies that c_(jj)≥0 and c_(jk)=c_(kj).

First, the diagonal elements c_(jj), for j=1, n, are estimated asfollows:

1. Let a=(0, 0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0) (wherethe 1 appears in the j'th position).

2. Construct an estimate {circumflex over (Γ)}_(ab) of the decay rateΓ_(ab)=2c_(jj). Define g_(j)={circumflex over (Γ)}_(ab)/2.

To write this in a compact form, define diag(C)=(c₁₁, . . . , c_(nn)),(Eq. 2.18)

that is, the diagonal of C. Next, define g=(g₁, . . . , g_(n)), (Eq.2.19)

This is viewed as an estimator for diag(C).

Next, the off-diagonal elements c_(jk), for 1≤j<k≤n, are estimated asfollows:

1. Let a=(0, 0, . . . , 0) and b=(0, . . . , 0, 1, 0, . . . , 0, 1, 0, .. . , 0) (where the 1's appear in the j'th and k'th positions).

2. Construct an estimate {circumflex over (Γ)}_(ab) of the decay rateΓ_(ab)=2(c_(jj)+2c_(jk)+c_(kk)) and define

$h_{jk} = {{\frac{1}{4}{\overset{\hat{}}{\Gamma}}_{ab}} - {\frac{1}{2}g_{j}} - {\frac{1}{2}{g_{k}.}}}$

To write this in a compact form, define C′ to be the matrix C 400 withthe diagonal entries replaced by zeros,

$\begin{matrix}{C_{jk}^{\prime} = \{ \begin{matrix}c_{jk} & {{{ifj} \neq k},} \\0 & {{ifj} = {k.}}\end{matrix} } & ( {{Eq}.2.2} )\end{matrix}$

The “off-diagonal portion” of matrix C is 2 s sparse. An estimator forĈ′ is constructed, as follows:

$\begin{matrix}{{\overset{\hat{}}{C}}_{jk}^{\prime} = \{ \begin{matrix}h_{jk} & {{{ifj}\  < k},} \\h_{kj} & {{{ifj}\  > k},} \\0 & {{ifj}\  = {k.}}\end{matrix} } & ( {{Eq}.2.21} )\end{matrix}$

The accuracy of these estimators may be analyzed as follows. Initially,two parameters δ₁ and δ₂ are chosen. Suppose that, during themeasurement of the diagonal elements c_(jj), the decay rates Γ_(ab) areestimated with accuracy ∥{circumflex over (Γ)}_(ab)−Γ_(ab)∥_(ψ) ₂≤δ₁Γ_(ab), (Eq. 2.22) and during the measurement of the off-diagonalelements c_(jk), the decay rates Γ_(ab) are estimated with accuracy∥−Γ_(ab)∥_(ψ) ₂ ≤δ₂Γ_(ab). (Γ_(ab) (Eq. 2.23).

Bounds of the form (Eq. 2.22) and (Eq. 2.23) can be obtained fromequation (Eq. 2.10), by setting N_(trials)˜1/δ₁ ² and N_(trials)˜1/δ₂ ²,respectively. Counting those trials of the experiment that are used tochoose the evolution time t may be neglected, because the number ofthose trials grows only logarithmically with Γ_(ab). δ₁ and δ₂ may bedifferent, because in many experimental scenarios, measurements ofc_(jj) take less time than measurements of c_(jk). Using (Eq. 2.22) and(Eq. 2.23), it can be shown that bounds on the accuracy of g_(j) andh_(jk) are:

$\begin{matrix}{{{{g_{j} - c_{jj}}}_{\psi_{2}} \leq {\delta_{1}c_{jj}}},} & ( {{Eq}.2.24} )\end{matrix}$ $\begin{matrix}{{{h_{jk} - c_{jk}}}_{\psi_{2}} \leq {{\delta_{2}c_{jk}} + {\frac{1}{2}( {\delta_{1} + \delta_{2}} ){( {c_{jj} + c_{kk}} ).}}}} & ( {{Eq}.2.25} )\end{matrix}$

These bounds on the sub-Gaussian norm imply bounds on the moments, suchas

|X|≤∥X∥_(ψ) ₂ and

(X²)≤2∥X∥_(ψ) ₂ ².

The results to bound the accuracy of the estimators g and Ĉ′ may befollowed. For g, using the

₁ and

₂ norms, the bounds are:

(∥g−diag(C)∥₁)≤δ₁∥diag(C)∥₁,  (Eq. 2.26)

(∥g−diag(C)∥₂ ²)≤2δ₁ ²∥diag(C)∥₂ ².  (Eq. 2.27)

For Ĉ′, using the

₁ vector norm and the Frobenius matrix norm, the bounds are:

$\begin{matrix}{{{{\mathbb{E}}( {{{\hat{C}}^{\prime} - C^{\prime}}}_{\ell_{1}} )} \leq {2{\sum_{j < k}( {{\delta_{2}c_{jk}} + {\frac{1}{2}( {\delta_{1} + \delta_{2}} )( {c_{jj} + c_{kk}} )}} )}}} = {{{\delta_{2}{\sum_{j \neq k}c_{jk}}} + {\frac{1}{2}( {\delta_{1} + \delta_{2}} ){\sum_{j \neq k}( {c_{jj} + c_{kk}} )}}} \leq {( {n - 1} )( {\delta_{1} + \delta_{2}} ){{{{{{diag}(C)}_{\ell_{1}}} + {\delta_{2}{C^{\prime}}_{\ell_{1}}}},}}}}} & ( {{Eq}.2.28} )\end{matrix}$ $\begin{matrix}{{{\mathbb{E}}( {{{\hat{C}}^{\prime} - C^{\prime}}}_{F}^{2} )} \leq {2{\sum_{j < k}{2\lbrack {{\delta_{2}c_{jk}} + {\frac{1}{2}( {\delta_{1} + \delta_{2}} )( {c_{jj} + c_{kk}} )}} \rbrack}^{2}}} \leq {2{\sum_{j < k}{6\lbrack {{\delta_{2}^{2}c_{jk}^{2}} + {\frac{1}{4}( {\delta_{1} + \delta_{2}} )^{2}c_{jj}^{2}} + {\frac{1}{4}( {\delta_{1} + \delta_{2}} )^{2}c_{kk}^{2}}} \rbrack}}} \leq {{6\delta_{2}^{2}{C^{\prime}}_{F}^{2}} + {\frac{3}{2}( {\delta_{1} + \delta_{2}} )^{2}{\sum_{j \neq k}( {c_{jj}^{2} + c_{kk}^{2}} )}}} \leq {3( {n - 1} )( {\delta_{1} + \delta_{2}} )^{2}{{{{{diag}(C)}_{2}^{2}} + {6\delta_{2}^{2}{{C^{\prime}}_{F}^{2}.}}}}}} & ( {{Eq}.2.29} )\end{matrix}$

Note that, in the second step of (Eq. 2.29), the fact that for any realnumbers a₁, a₂, a₃, (a₁+a₂+a₃)²≤3(a₁ ²+a₂ ²+a₃ ²) was used.

These are bounds on the expected error of g and {tilde over (C)}′. Onecan then use Markov's inequality to prove bounds that hold with highprobability. For instance, using (Eq. 2.27), with probability at least1−η, results in

$\begin{matrix}{{{{g - {{diag}(C)}}}_{2} \leq {\frac{1}{\sqrt{\eta}}\sqrt{2}\delta_{1}{{{diag}(C)}}_{2}}},} & ( {{Eq}.2.3} )\end{matrix}$

Using (Eq. 2.29), with probability at least 1−η, results in:

$\begin{matrix}{{{{\overset{\hat{}}{C}}^{\prime} - C^{\prime}}}_{F} \leq {\frac{1}{\sqrt{\eta}}\lbrack {{3( {n - 1} )( {\delta_{1} + \delta_{2}} )^{2}{{{diag}(C)}}_{2}^{2}} + {6\delta_{2}^{2}{C^{\prime}}}} \rbrack}^{1/2} \leq {{\frac{1}{\sqrt{\eta}}\lbrack {{\sqrt{3}\sqrt{n - 1}( {\delta_{1} + \delta_{2}} ){{{diag}(C)}}_{2}} + {\sqrt{6}\delta_{2}{C^{\prime}}_{F}}} \rbrack}.}} & ( {{Eq}.2.31} )\end{matrix}$

In fact, one can prove tighter bounds on the failure probability, byreplacing Markov's inequality with a sharp concentration bound forsub-Gaussian random variables.

The bounds (Eq. 2.30) and (Eq. 2.31) give a rough sense of how well thisestimator performs. In particular, these bounds show that the error inestimating C′ depends on the magnitude of diag(C), as well as themagnitude of C′. This is due to the fact that the procedure forestimating the off-diagonal matrix element c_(jk) also involves thediagonal matrix elements c_(jj) and c_(jk). These bounds may be used asa baseline to understand the performance of the disclosed compressedsensing estimators.

FIGS. 7A and 7B are tables illustrating sample complexity of differentmethods for reconstructing a correlation matrix C 400 (FIG. 6 ), of sizen×n, with 2 s nonzero elements off the diagonal. In aspects, the naivemethod is to measure each element of C separately. “RIP-based” and“RIPless” refer to different analytical bounds on the accuracy of thereconstruction of matrix C 400 (FIG. 6 ). The asterisk (*) indicatesthat the results using the RIPless bound hold under a technicalassumption that the diagonal of C does not have any unusually largeelements. The different methods are parameterized in such a way thatthey reconstruct the diagonal of matrix C 400 (FIG. 6 ) up to anadditive error of size δ∥diag(C)∥₂, and they reconstruct theoff-diagonal part of C up to an additive error of size δ∥C∥_(F). Eachmethod makes use of single-qubit spectroscopy (with n experimentalconfigurations or “measurement settings”), as well as multi-qubitspectroscopy (with m measurement settings, where m varies between m∝n²and m∝s log n). The total sample complexity, shown in the tables ofFIGS. 7A and 7B, include both single-qubit and multi-qubit spectroscopy.For the compressed sensing method, the number of measurement settingscan be as low as m∝s log n, using the RIPless bound, but the best samplecomplexity is achieved when m∝s log⁴n, using the RIP-based bound. Thissample complexity, O(max(n, s)² log⁴(n)), compares favorably with thesample complexity of the naive method, which is O(n³). Thus, Compressedsensing has an advantage over the naive method whenevers≤O(n^(3/2)/log²n).

The performance of the disclosed compressed sensing method may becompared to the naive method where one measures each element of Cindependently. A rigorous comparison may be made based on the errorbounds for each of these methods. These can be summarized as follows:

Naive:∥Ĉ−C∥ _(F) ≤O(√{square root over (n)}(δ₁+δ₂)∥diag(C)∥₂+δ₂ ∥C′∥_(F)),  (Eq. 1.4)

RIP-based:∥W ^((opt)) −C∥ _(F) ≤O(√{square root over(n)}δ₁∥diag(C)∥₂+δ₂(√{square root over (n)}∥diag(C)∥₂+√{square root over(2s)}∥C′∥ _(F))),  (Eq. 1.5)

RIPless:∥W ^((opt)) −C∥ _(F) ≤O(s log^(5/2)(n)[δ₁∥diag(C)∥_(∞)√{squareroot over (sn log)}(n)+δ₂√{square root over (n)}∥diag(C)∥₂+δ₂√{squareroot over (2s)}∥C′∥ _(F)]).  (Eq. 1.6)

Here Ĉ is the estimate of C using the naive method, and W^((opt)) is theestimate of C using compressed sensing. diag(C) denotes the diagonal ofC, and C′ denotes the off-diagonal part of C. m∝s log⁴n is set for theRIP-based bound, and m∝s log n for the RIPless bound. (2.30) may be usedto bound the error in estimating diag(C), and (Eq. 2.31), (Eq. 5.48) and(Eq. 5.39) to bound the error in estimating C′. Finally, δ₁ and δ₂ areparameters that control the accuracy of the single-qubit and multi-qubitspectroscopy procedures. Using the disclosed method rigorous bounds onthe sample complexity may be proven that are required for each method toreconstruct C with comparable accuracy.

For the compressed sensing method, the number of measurement settingscan be as low as m∝s log n, using the RIPless bound, but the best samplecomplexity is achieved when m∝s log⁴n, using the RIP-based bound. Thissample complexity, O(max(n, s)² log⁴(n)), compares favorably with thesample complexity of the naive method, which is O(n³). Thus, compressedsensing has an advantage over the naive method whenevers≤O(n^(3/2)/log²n).

The estimation of the decay rate Γ_(ab)=2r^(T)Cr may be performed withhigh precision (i.e., with small multiplicative error), by allowing thequantum system 100 to evolve for a time t˜1/Γ_(ab). Here the time t ischosen by an adaptive procedure that starts with an initial guess τ₀ andconverges (provably) after˜|log(Γ_(ab) τ₀)| steps (see FIG. 3 ). Toachieve high precision, the time evolution operator

is treated exactly.

The above-noted feature is helpful in experimental setups wheremeasurements are time-consuming and the values of Γ_(ab) span severalorders of magnitude, thus making it difficult to determine theappropriate evolution time and accurately measure the decay rates usingconventional methods. This feature can also be used as a standalonetechnique to estimate the relaxation (T₁) and decoherence (T₂) times ofany quantum system 100.

Since the disclosed method relies on estimating exponential decay rates,it can be made at least partially robust tostate-preparation-and-measurement (SPAM) errors (see FIGS. 9A and 9B).This follows some of the same approaches used in randomized benchmarkingand gate set tomography. The estimated decay rate may be used toestimate a relaxation time of the quantum system 100. In aspects, adecoherence time of the quantum system 100 may be estimated based on theestimated decay rate.

The disclosed method enables a unitary evolution according to someHamiltonian H_(S), and enables the matrix C 400 (FIG. 6 ) to havecomplex entries c_(jk). A similar approach can be used to estimate theHamiltonian H_(S), as well as the imaginary part of c_(jk). Thedisclosed technology measures correlated dephasing and not Pauli errors.In aspects, the measurement may include Ramsey spectroscopy. In aspectsthe reconstruction algorithm may include, for example, convexoptimization over a continuous domain.

The disclosed method assumes that the noise is sparse, meaning that thenoise is supported on a subset S of the domain, where S is small, butunknown.

It is contemplated that the disclosed method for SPAM-robust estimationof sparse noise models, may be used to characterize other kinds ofcorrelated noise processes in many-body quantum systems.

The disclosed technology includes an efficient method for learning theoff-diagonal part of the correlation matrix C, under the assumption thatit is sparse, i.e., the part of C that lies above the diagonal has atmost s nonzero elements, where s<<n(n−1)/2. (Since C is Hermitian, it issufficient to learn the part that lies above the diagonal; this thendetermines the part that lies below the diagonal.)

For simplicity, first consider the special case where the entries in thematrix C are real (i.e., with zero imaginary part), which occurs in anumber of physical situations (for example, when the system is coupledto a bath at a high temperature.

The disclosed method consists of two steps: first, single-qubit Ramseyspectroscopy is performed in order to learn the diagonal elements of C;second, techniques from compressed sensing (e.g., random measurements,and

₁-minimization) are applied in order to recover the off-diagonalelements of C. In aspects, a long-range correlated dephasing error maybe detected based on the recovered matrix.

In aspects, random measurements of the correlation matrix may beperformed. Initially, each of the diagonal elements c_(jj), for j=1, . .. , n, are estimated using single-qubit Ramsey spectroscopy, asillustrated in FIG. 5 . Let g=(g₁, . . . , g_(n))∈

^(n) be the output of this procedure. g may be viewed as an estimate ofa “sensing operator” that returns the diagonal elements of the matrix C400,

g≈diag(C)=(c ₁₁ ,c ₂₂ , . . . ,c _(nn)).  (Eq. 3.1)

(Note that c_(jj)≥0, since C is positive semidefinite.)

In order to estimate the off-diagonal part of C, a compressed sensingtechnique is used, which involves a certain type of generalized Ramseymeasurement with random GHZ-type states, see FIG. 4 . First, a parameterm is chosen, which can be roughly m˜s log n or m˜s log⁴n, which controlsthe number of different measurements. The particular choice of m may bemotivated by a theoretical recovery guarantee. Now, for j=1, . . . , m,perform the following procedure:

1. Choose vectors a, b∈{0,1}^(n) uniformly at random. As in equation(2.4), define

r=b−a.  (Eq. 3.2)

2. Prepare the state

$  { { {❘\psi_{ab}} \rangle = {\frac{1}{\sqrt{2}}( {❘a} }} \rangle + {❘b}} \rangle ).$

This is a GHZ state on a subset of the qubits, with some bit flips. Itcan be created by preparing those qubits i where a_(i)=b_(i) in thestate |a_(i)

, preparing those qubits i where a_(i)≠b_(i) in a GHZ state, andapplying a Pauli X operator on those qubits i where a_(i)>b_(i). (Thisrequires a quantum circuit of depth ┌log₂(n)┐+2.)

3. Construct an estimate {circumflex over (Γ)}_(ab) of the decay rateΓ_(ab)=2r^(T)Cr. Define h_(j)={circumflex over (Γ)}_(ab).

Let h=(h₁, . . . , h_(m))∈

^(m) be the output of the above procedure. Again, h may be viewed as anestimate of a “sensing operator” Φ:

^(n×n)→

^(m) that is applied to the matrix C 400 (FIG. 6 ),

h≈Φ(C)=[Φ_(j)(C)]_(j=1, . . . ,m)  (Eq. 3.3)

Φ_(j)(C)=2(r ^((j)))^(T) Cr ^((j))  (Eq. 3.4)

where r⁽¹⁾, r⁽²⁾, . . . , r^((m))∈{1,0,−1}^(n) are independent randomvectors chosen from the same distribution as r (described above). Notethat Φ_(j)(C)≥0, since C is positive semidefinite. The factor of 2 ischosen to ensure that Φ has a certain isotropy property, as describedherein.

In aspects, the correlation matrix C∈

^(n×n) may be reconstructed. C is positive semidefinite, due to physicalconstraints, and its off-diagonal part is sparse, with at mosts<<n(n−1)/2 nonzero elements above the diagonal. In general, thissparsity constraint leads to an optimization problem that iscomputationally intractable. However, in this particular case, thisproblem can be solved using a strategy from compressed sensing: giveng≈diag(C)∈

^(n) and h≈Φ(C) ∈

^(m), matrix C 400 is recovered by solving a convex optimizationproblem, where the

₁ (vector) norm of the off-diagonal part of the matrix is minimized.This strategy succeeds when m≥c₀s log n, and is highly robust to noisewhen m≥c₀s log⁴n, where c₀ is some universal constant.

Initially, the case where the measurements are noiseless, i.e.,g=diag(C) and h=Φ(C) is considered. The following convex optimizationproblem is solved:

$\begin{matrix}{{{{Find}W} \in {{\mathbb{R}}^{n \times n}{}{{that}{minimizes}}{\sum_{i \neq j}{❘W_{ij}❘}}}},} & ( {{Eq}.3.5} )\end{matrix}$ suchthat: $\begin{matrix}{{{{diag}(W)} = g},} & ( {{Eq}.3.6} )\end{matrix}$ $\begin{matrix}{{{\Phi(W)} = h},} & ( {{Eq}.3.7} )\end{matrix}$ $\begin{matrix}{W \succeq 0.} & ( {{Eq}.3.8} )\end{matrix}$

Here, W

0 means that W is positive semidefinite, which implies that W=W^(T). Asa sanity check, note that W=C is a feasible solution to this problem.(Recall that C is positive semidefinite.)

In phase retrieval, one wishes to estimate an unknown vector x frommeasurements of the form |r^(T)x|². In aspects, a PhaseLift algorithmworks by “lifting” the unknown vector x to a matrix X=xx^(T), so thatthe problem becomes one of learning a rank-1 matrix X from measurementsof the form r^(T)Xr; then one solves a convex relaxation of thisproblem. It is contemplated that in cases where the unknown vector x issparse (“compressive phase retrieval”), variants of the PhaseLiftalgorithm (as well as other approaches) can also be used.

In the disclosed method, the unknown matrix C is almost always full-rank(because every qubit has a nonzero rate of dephasing). Physicalconstraints imply that C is positive semidefinite, so it can be factoredas C=BB^(T), which is superficially similar to X=xx^(T); however, animportant difference is that B is a square matrix, whereas x is avector.

In aspects, the matrix C 400 may be reconstructed from noisymeasurements. In the case where the measurements of g and h are noisy,the above convex optimization problem is modified, by relaxing theconstraints (Eq. 3.6) and (Eq. 3.7). This leads to some technicalcomplications, due to the fact that two variables are beingreconstructed that have different characteristics: the diagonal part ofC (which is not sparse), and the off-diagonal part of C (which issparse).

To deal with these issues, two different ways of performing thisreconstruction, when the measurements are noisy are: (1) simultaneousreconstruction of both parts of C, and (2) sequential reconstruction ofthe diagonal part of C, followed by the off-diagonal part of C. Theformer approach is arguably more natural, but the latter approachenables more rigorous analysis of the accuracy of the reconstruction.

Assuming the following bounds on the

₂ norms of the noise terms (u and v), that is,

g=diag(C)+u,∥u∥ ₂≤∈₁,  (Eq. 3.9)

h=Φ(C)+v,∥v∥ ₂≤∈₂.  (Eq. 3.10)

Simultaneous reconstruction of the diagonal and off-diagonal parts of C:the constraints (Eq. 3.6) and (Eq. 3.7) are relaxed in the simplestpossible way, by replacing them with:

∥diag(W)−g∥ ₂≤∈₁,  (Eq. 3.11)

∥Φ(W)−h∥ ₂≤∈₂.  (Eq. 3.12)

This leads to a convex optimization problem that attempts to reconstructboth the diagonal part of C, which is not necessarily sparse, and theoff-diagonal part of C, which is assumed to be sparse. (Note that W=C isa feasible solution to this problem.)

The reconstruction algorithm involves two different estimators (an

₁-regularized estimator for the off-diagonal part of C, and aleast-squares estimator for the diagonal part of C). These twoestimators are coupled together (through the constraint on Φ(W), and thepositivity constraint W

0).

Therefore, this method can behave quite differently, depending onwhether the dominant source of error is g or h. When g is the dominantsource of error, this method will behave like a least-squares estimator,whose accuracy scales according to the dimension n; when h is thedominant source of error, this method will behave like an

₁-regularized estimator, whose accuracy scales according to the sparsitys (neglecting log factors). From a theoretical point of view, this makesit more difficult to prove recovery guarantees for this method.

Sequential reconstruction of the diagonal part of C, followed by theoff-diagonal part of C: In practice, one is often interested in theregime where g is known with high precision, and h is the dominantsource of error. This is because measurements of g are relatively easyto perform, because they only require single-qubit state preparationsand measurements; whereas measurements of h are more costly, becausethey require the preparation and measurement of entangled states on manyqubits. So measurements of g can often be performed more quickly, andmeasurements on different qubits can be performed simultaneously inparallel; hence one can repeat the measurements more times, to obtainmore accurate estimates of g.

In this regime, it is natural to try to recover the diagonal part of Cdirectly from g, and then use

₁-minimization to recover only the off-diagonal part of C. This leads toa convex optimization problem which is arguably less natural, but itmakes it easier to prove rigorous guarantees on the accuracy of thereconstruction of C. Starting from the convex optimization problem ((Eq.3.5)-(Eq. 3.8)) for the noiseless case, and the last two constraints arerelaxed to get:

$\begin{matrix}{{{{Find}W} \in {{\mathbb{R}}^{n \times n}{}{{that}{minimizes}}{\sum_{i \neq j}{❘W_{ij}❘}}}},} & ( {{Eq}.3.13} )\end{matrix}$ suchthat: $\begin{matrix}{{{{diag}(W)} = g},} & ( {{Eq}.3.14} )\end{matrix}$ $\begin{matrix}{{{{\Phi(W)} - h}}_{2} \leq {\epsilon_{2} + {\epsilon_{1}\sqrt{{mn},}}}} & ( {{Eq}.3.15} )\end{matrix}$ $\begin{matrix}{{{d( {W,K_{+}} )} \leq \epsilon_{1}},{{{and}W} = {W^{T}.}}} & ( {{Eq}.3.16} )\end{matrix}$

Here K₊={W′∈

^(n×n)|W′

0} denotes the (real) positive semidefinite cone, and

$\begin{matrix}{{d( {W,K_{+}} )} = {\min\limits_{W^{\prime} \in K_{+}}{{W - W^{\prime}}}_{F}}} & ( {{Eq}.3.17} )\end{matrix}$

to be the minimum distance from W to a point W′ in K₊, measured inFrobenius norm; note that this is a convex function. While this convexoptimization problem looks complicated, it follows from a simpleunderlying idea: since the diagonal elements of W are fixed by theconstraint (Eq. 3.14), this is simply an

₁-regularized estimator for the sparse, off-diagonal part of W.

The attentive reader will notice two potential concerns with thisapproach. First, in general, C will not be a feasible solution to thisconvex optimization problem, since the diagonal elements of C will notsatisfy (Eq. 3.14). However, C lies close to a feasible solution. To seethis, let {tilde over (C)} be the matrix whose off-diagonal elementsagree with C, and whose diagonal elements agree with g. Then C is withindistance ∈₁ of {tilde over (C)} (in Frobenius norm), and {tilde over(C)} is a feasible solution. To see this, check that {tilde over (C)}satisfies the constraints (Eq. 3.14), (Eq. 3.15) and (Eq. 3.16), since:

$\begin{matrix}\begin{matrix}{{{{\Phi( \overset{\sim}{C} )} - {\Phi(C)}}}_{2} = {{\Phi( {{diag}( {g - {{diag}(C)}} )} )}}_{2}} \\{\leq ( {m{{g - {{diag}(C)}}}_{1}^{2}} )^{1/2}} \\{\leq {\epsilon_{1}\sqrt{{mn}.}}}\end{matrix} & (3.18)\end{matrix}$

(Here, {tilde over (C)} is written in a compact form, {tilde over(C)}=C+diag (g−diag(C)), where the diag(·) notation has the followingmeaning: for a matrix M, diag (M) is the vector containing the entriesM_(jj) that lie along the diagonal of M; and for a vector v=(v₁, . . . ,v_(n)), diag(v) is the diagonal matrix with v₁, . . . , v_(n) along thediagonal.)

Second, the optimal solution W may violate the positivity constraint(Eq. 3.8), making it un-physical/non-physical. (Similar issues can arisewhen performing quantum state and process tomography.) However, W can beeasily corrected to get a physically-admissible solution. This followsbecause equation (3.16) shows that W lies within distance ∈₁ of aphysically-admissible solution W′

0, and this solution W′ can be obtained by truncating the negativeeigenvalues of W.

Finally, there are different ways of relaxing the positivity constraint(Eq. 3.8), and (Eq. 3.16) is not the strongest possible choice. Forinstance, a stronger constraint could be used than (Eq. 3.16), such as:d′(W,K₊)≤∈₁, where d′(W,K₊)=min_(W′∈K) ₊ ∥diag (W−W′)∥₂. However, theconstraint (Eq. 3.16) may be simpler to implement using numerical linearalgebra software.

Since these are convex optimization problems, they can be solvedefficiently (both in theory and in practice), for instance by usinginterior point algorithms. Nonetheless, some care is needed to ensurethat these algorithms can scale up to solve very large instances ofthese problems. In particular, enforcing the positivity constraint (Eq.3.8), and its relaxed version (Eq. 3.16), can be computationallyexpensive.

In aspects, the positivity constraint may be omitted. The matrix C 400can be reconstructed by solving the convex optimization problem (Eq.3.13)-(Eq. 3.16). This analysis holds even without the positivityconstraint (Eq. 3.16). In aspects, it is easy to check that thepositivity constraint is not used, and indeed, most of the theory ofcompressed sensing applies to all sparse signals, not just positiveones, although positivity can be helpful in certain situations.

This observation has a practical consequence: by omitting the positivityconstraint (Eq. 3.16), one can make the convex optimization problemsimpler, and thus easier to solve in practice (e.g., by usingsecond-order cone programming, rather than semidefinite programming).One can then take the resulting solution, and project it onto thepositive semidefinite cone, as is sometimes done in quantum statetomography, without increasing the error (in Frobenius norm). Thistechnique is useful for scaling up the disclosed method to extremelylarge numbers of qubits.

Next, setting the Error Parameters ∈₁ and ∈₂ is described. To set theparameters ∈₁ and ∈₂ in equations (3.9) and (3.10): First, consider ∈₁,which bounds u, the error in g. Note that, when g_(j) is estimated usingthe above noted procedure, large-deviation bounds on u_(j) are obtained.In particular, for some δ₁>0, ∥u_(j)∥_(ψ) ₂ ≤δ₁c_(jj), (Eq. 3.19)

where ∥·∥_(ψ) ₂ is the sub-Gaussian norm. This bound can be obtainedfrom equation (2.10), by setting N_(trials)˜1/δ₁ ². The trials of theexperiment that are used to choose the evolution time t may be neglectedin the count, because the number of those trials grows onlylogarithmically with Γ_(ab).

This implies that ∥u∥₂ ² is a subexponential random variable, whosesubexponential norm is at most 2δ₁ ²∥diag(C)∥₂ ². This implies that ∥u∥₂is bounded with high probability: for any τ≥1,Pr[∥u∥₂≥τδ₁∥diag(C)∥₂]≤e·exp(−τ²/2c), (Eq. 3.20)

where c>0 is some universal constant. τ may be chosen to be asufficiently large constant, so that the failure probability is small.(Note that in some cases, one can prove stronger bounds, by takingadvantage of the fact that the coordinates of u are independent, andusing a Bernstein-type inequality. This bound is stronger when thediagonal elements of C satisfy∥diag(C)∥_(∞)<<∥diag(C)∥₂, i.e., when uhas many independent coordinates with similar magnitudes.)

The above bound does not immediately tell how to set ∈₁, because thebound depends on diag(C), which is not known exactly. Instead, a boundthat depends on g is derived, which is known explicitly, and can be usedto set ∈₁. To do this, assume that δ₁ is sufficiently small so thatτδ₁<1/4. With high probability:

$\begin{matrix}\begin{matrix}{{u}_{2} \leq {{\tau\delta}_{1}{{{diag}(C)}}_{2}}} \\{\leq {{\tau\delta}_{1}( {{g}_{2} + {u}_{2}} )}} \\{{\leq {\frac{{\tau\delta}_{1}}{1 - {\tau\delta}_{1}}{g}_{2}\text{=:}\epsilon_{1}}},}\end{matrix} & (3.21)\end{matrix}$

Using the triangle inequality, and some algebra shows how to set ∈₁ sothat equation (3.9) holds.

∈₁ may also be bounded in terms of diag(C), as follows:

$\begin{matrix}\begin{matrix}{\epsilon_{1} \leq {\frac{{\tau\delta}_{1}}{1 - {\tau\delta}_{1}}( {{{{{diag}(C)}_{2}} + {u}_{2}}} )}} \\{\leq {\frac{{\tau\delta}_{1}}{1 - {\tau\delta}_{1}}( {1 + {\tau\delta}_{1}} ){{{diag}(C)}}_{2}}} \\{\leq {2{\tau\delta}_{1}{{{{diag}(C)}}_{2}.}}}\end{matrix} & ( {{Eq}.3.22} )\end{matrix}$

-   -   In aspects, this bound may be used to analyze the accuracy of        the estimate of matrix C 400. ∈₂ bounds v, which is the error        in h. The same approach as above may be used. When h_(j) is        estimated a bound on the sub-Gaussian norm of v_(j) is obtained:        for some δ₂>0,

∥v _(j)∥_(ψ) ₂ ≤δ₂Φ_(j)(C).  (Eq. 3.23)

(This bound can be obtained from equation (2.10), by settingN_(trials)˜1/δ₂ ². δ₂ to may be different from δ₁, because themeasurements used to estimate h_(j) are more costly than themeasurements used to estimate g_(j), hence one may prefer to usedifferent values for N_(trials) in each case.)

This implies that ∥v∥₂ ² is a subexponential random variable, hence ∥v∥₂is bounded with high probability: for any τ≥1,

Pr[∥v∥ ₂≥τδ₂∥Φ(C)∥₂]≤e·exp(−τ₂/2c),  (Eq. 3.24)

where c>0 is some universal constant, choose τ to be a sufficientlylarge constant, so that the failure probability is small. (Note that,for typical choices of Φ(·), it is expected that ∥Φ(C)∥_(∞)<<∥Φ(C)∥₂.This implies that v has many independent coordinates with similarmagnitudes. When this occurs, one can prove a stronger bound using aBernstein-type inequality.

The above bound does not immediately show how to set ∈₂, because thebound depends on Φ(C), which is not known exactly. Instead, derive abound that depends on h, which is known explicitly, and can be used toset ∈₂. To do this, it is assumed that δ₂ is sufficiently small so thatτδ₂<1/4. With high probability:

$\begin{matrix}\begin{matrix}{{v}_{2} \leq {{\tau\delta}_{2}{{\Phi(C)}}_{2}}} \\{\leq {{\tau\delta}_{2}( {{h}_{2} + {v}_{2}} )}} \\{\leq {\frac{{\tau\delta}_{2}}{1 - {\tau\delta}_{2}}{h}_{2}\text{=:}{\epsilon_{2}.}}}\end{matrix} & (3.25)\end{matrix}$

This indicates how to set ∈₂ so that equation (3.10) holds. Finally, ∈₂can also be bounded in terms of ∥C

₁, as follows:

$\begin{matrix}\begin{matrix}{\epsilon_{2} \leq {\frac{{\tau\delta}_{2}}{1 - {\tau\delta}_{2}}( {{{\Phi(C)}}_{2} + {v}_{2}} )}} \\{\leq {\frac{{\tau\delta}_{2}}{1 - {\tau\delta}_{2}}( {1 + {\tau\delta}_{2}} ){{\Phi(C)}}_{2}}} \\{\leq {2{\tau\delta}_{2}{{\Phi(C)}}_{2}}} \\{\leq {2{\tau\delta}_{2}\sqrt{m}{{\Phi(C)}}_{\infty}}} \\{\leq {4{\tau\delta}_{2}\sqrt{m}{C}_{\ell_{1}.}}}\end{matrix} & (3.26)\end{matrix}$

In aspects, this bound may be used when analyzing the accuracy of theestimate of C.

Referring to FIGS. 8A-8C, graphs that illustrating scaling of thereconstruction error ∥W^((opt))−C∥_(∞) under various circumstances areshown. Here W^((opt)) denotes the estimate of C obtained via compressedsensing. The solid lines are the average errors over 100 randominstances of the problem, and the shaded region is their 95% confidenceinterval obtained by bootstrapping. FIG. 8A illustrates thereconstruction error as a function of the number of measurement settingsm (assuming noiseless measurements) for various values of sparsity s(and n=64 qubits). The errors go through a phase transition whoselocation m_(c) scales linearly with s. FIG. 8B illustrates thereconstruction error as a function of the number of measurement settingsm (assuming noiseless measurements) for various numbers of qubits n (andsparsity s=12). The phase transition point m_(c) scales logarithmicallywith n. FIG. 8C illustrates the reconstruction error as a function ofthe number of measurement settings m, for different values of addednoise strength σ_(∈), with fixed parameters (n, s)=(64,12). The insetshows that the recovery errors (when m>m_(c), i.e., after the transitionpoint) scale linearly with σ_(∈), as expected.

Numerical simulations are used to test the performance of the disclosedmethod on realistic system sizes, with different levels of sparsity, andwhen the data contain statistical fluctuations due to finite samplesizes. As shown in FIGS. 8A-8C, the disclosed method performs welloverall.

The protocol for randomly chosen C matrices is numerically simulated. Inthese examples it is assumed that the diagonal elements of C are known,that is, ∈₁=0 in Eq. (3.11). The convex optimization problem given byequations (3.5)-(3.7) is solved using, for example, CVXPY, a convexoptimization package for Python®.

The case of noiseless measurements, corresponding to ∈₂=0 in Eq. (3.12)is investigated. In FIG. 8A the recovery error is shown as a function ofthe number of measurements, m, for a fixed number of qubits, n, andvarious choices of the off-diagonal sparsity, s. The sharp transition inthe recovery error as a function of m is evident. Moreover, as shown inthe inset of FIG. 8A, the transition point m_(c), which is defined asthe point where ∥C−W^((opt))∥_(∞) drops below 0.25, scales linearly withs, consistent with the analytical results. In FIG. 8B s is fixed, n isvaried, and the recovery error as a function of m is studied. A phasetransition may be observed as m increases. In this case, m_(c) scalespolynomially with log(n) as suggested in the inset of FIG. 8B.

The effect of noisy measurements on the recovery error is investigated.Random C matrices may be generated, with a fixed number of qubits n andsparsity s. Noise is simulated by adding a random vector e, whoseentries are independent Gaussian random variables with mean 0 andstandard deviation σ_(∈), to measurement vector h. Eq. (3.7) may bereplaced in the previous convex program with Eq. (3.12) and choose∈₂=√{square root over (m)}σ_(∈). The scaling of the reconstruction error∥W^((opt))−C∥_(∞) as a function of σ_(∈) is shown in FIG. 8C. Therecovery error after the phase transition point scales linearly withσ_(∈), consistent with the proposed analytical bounds.

The convex optimization problem (Eq. 3.13)-(Eq. 3.16) is used to proverigorous recovery guarantees that show that the optimal solutionW^((opt)) is close to the true correlation matrix C 400 (FIG. 6 ),provided that m≥c₀s log n (and with better robustness to noise, whenm≥c₀s log⁴n). Here, m is the dimension of the measurement vector h, s isthe sparsity (the number of nonzero elements) in the off-diagonal partof the matrix C, and c₀ is some universal constant.

In aspects, two different results are proven: a non-universal recoveryguarantee, using the “RIPless” framework, as well as a universalrecovery guarantee, using RIP-based techniques. Here, RIP refers to the“restricted isometry property,” a fundamental proof technique incompressed sensing. There are different advantages to the RIPless andRIP-based bounds: the RIPless bounds require slightly fewermeasurements, while the RIP-based bounds are more robust when themeasurements are noisy.

In aspects, two variants of the problem (Eq. 3.13)-(Eq. 3.16) areintroduced: constrained

₁-minimization and the LASSO (“least absolute shrinkage and selectionoperator”). Generally speaking, recovery guarantees that hold for one ofthese problems can be adapted to the other one, with minormodifications.

To simplify the problem, start with the convex optimization problem (Eq.3.13)-(Eq. 3.16). |Remove the positivity constraint d(W,K₊)≤∈₁. Changethe objective function to sum over all i<j rather than all i≠j; since Wis symmetric, this merely changes the objective function by a factor of2. Finally, shift the variable W by subtracting away diag(g), so thatits diagonal elements are all zero. Similarly, shift the measurementvector h to get h′=h−Φ(diag(g)). (Eq. 5.1)

This provides an equivalent problem:

$\begin{matrix}{{{{Find}W} \in {{\mathbb{R}}^{n \times n}{}{{that}{minimizes}}{\sum_{i < j}{❘W_{ij}❘}}}},} & ( {{Eq}.5.2} )\end{matrix}$ suchthat: $\begin{matrix}{{{{diag}(W)} = 0},} & ( {{Eq}.5.3} )\end{matrix}$ $\begin{matrix}{{{{\Phi(W)} - {h^{\prime}_{2}}} \leq {\epsilon_{2} + {\epsilon_{1}\sqrt{{mn},}}}}} & ( {{Eq}.5.4} )\end{matrix}$ $\begin{matrix}{W = {W^{T}.}} & ( {{Eq}.5.5} )\end{matrix}$

An operation diag(·) that has two meanings: given an n×n matrix M,diag(M) returns an n-dimensional vector containing the diagonal elementsof M; and given an n-dimensional vector v, diag(v) returns an n×n matrixthat contains v along the diagonal, and zeros off the diagonal.

Define C′ to be the off-diagonal part of the correlation matrix C, thatis, C′ is the matrix whose off-diagonal elements match those of C, andwhose diagonal elements are zero.

C′=C−diag(diag(C)).  (Eq. 5.6)

h′ may be viewed as a measurement of C′, with additive error z,h′=Φ(C′)+z. (Eq. 5.7)

The solution W^((opt)) is an accurate estimate of C′. The error term zmay be written in the form:

$\begin{matrix}\begin{matrix}{z = {h^{\prime} - {\Phi( C^{\prime} )}}} \\{= {h - {\Phi(C)} - {\Phi( {{diag}( {g - {{diag}(C)}} )} )}}} \\{{= {v - {\Phi( {{diag}(u)} )}}},}\end{matrix} & (5.8)\end{matrix}$

where u and v are the noise terms in (Eq. 3.9) and (Eq. 3.10). next,bound z using (Eq. 3.10) and (Eq. 3.18), ∥z∥₂≤∈₂+∈₁√{square root over(mn)}. (Eq. 5.9)

It will be convenient to write

D={W∈

^(n×n) |W ^(T) =W,diag(W)=0}  (Eq. 5.10)

to denote the subspace of symmetric matrices whose diagonal elements areall 0. Let uvec: D→

^(n(n−1)/2) denote the linear operator that returns the upper-triangularpart of W, uvec: W

(W_(ij))_(i<j). (Eq. 5.11)

Write Φ_(D): D→

^(m) to denote the measurement operator Φ restricted to act on thesubspace D. The problem (Eq. 5.2)-(Eq. 5.5) may be rewritten in a moreconcise form:

$\begin{matrix}{{{{Find}{}W} \in {D{}{that}{minimizes}{{{uvec}(W)}}_{1}}},} & ( {{Eq}.5.12} )\end{matrix}$ suchthat: $\begin{matrix}{{{{\Phi_{D}(W)} - h^{\prime}}}_{2} \leq {\epsilon_{2} + {\epsilon_{1}\sqrt{{mn}.}}}} & ( {{Eq}.5.13} )\end{matrix}$

In aspects, a variant of the problem includes the LASSO:

Find W∈D that minimizes

$\begin{matrix}{\frac{1}{2}{{{\Phi_{D}(W)} - {h^{\prime}_{2}^{2}} + \lambda}}{{uvec}(W)}{_{1}.}} & ( {{Eq}.5.14} )\end{matrix}$

This can be viewed as a Lagrangian relaxation of the previous problem((Eq. 5.12)-(Eq. 5.13)), or as an

₁-regularized least-squares problem. In addition, the convexoptimization problems that were described earlier can also be relaxedinto a LASSO-like form, in a similar way. The choice of theregularization parameter λ requires some care.

In aspects, the Regularization Parameter λ may be set by the followingprocedure. In general, the regularization parameter λ controls therelative strength of the two parts of the objective function in. Whenthe noise in the measurement of h′ is strong, then λ must be set largeenough to ensure that the

₁ regularization term still has the desired effect. However, if A is toolarge, it strongly biases the solution W^((opt)), making it lessaccurate.

An approach to setting λ, may be to have a goal that is to ensure thatthe solution W^((opt)) converges to (a sparse approximation of) the truecorrelation matrix C. To do this, λ must be set large enough to satisfytwo constraints, which involve the noise in the measurement of h′ (seeequation (4.1) and the equation below (4.2)). When these constraints aresatisfied, the error in the solution W^((opt)) is bounded by equation(4.2). (Note that this error bound grows with 2L, hence one shouldchoose the smallest value of λ that satisfies the above constraints.)

In order to set 2L, first, precise statements of the two constraints onλ are provided:

∥uvec(Φ_(D) ^(†) z)∥_(∞)≤λ,  (Eq. 5.15)

∥uvec(Φ_(D,T) _(c) ^(†)(I−P)z∥ _(∞)≤λ.  (Eq. 5.16)

Here, z is the noise term in the measurement of h′ in equation (5.7);Φ_(D) ^(†):

^(m)→D is the adjoint of the measurement operator Φ_(D);T⊂{(j,j′)|1≤j<j′≤n} is the support of (a sparse approximation of) thetrue correlation matrix C; Φ_(D,T) is the sub-matrix of Φ_(D) thatcontains those columns of Φ_(D) whose indices belong to the set T; P isthe projection onto the range of Φ_(D,T); T^(c) is the complement of theset T; and Φ_(D,T) _(c) is the sub-matrix of Φ_(D) that contains thosecolumns of Φ_(D) whose indices belong to the set T^(c).

In order to set λ, compute the quantities in equations (5.15) and(5.16), and to do this, some bounds on the noise z are needed. Two waysof obtaining such bounds include:

One straightforward way is as follows. Equation (5.8) can be used towrite

z=v−Φ(diag(u)),  (Eq. 5.17)

where u and v are the noise terms in the measurements of g and h,respectively. Bounds on u and v (see equations (3.21) and (3.25)) implybounds on z, via an elementary calculation. However, one can get betterbounds on z by using a more sophisticated approach, starting with boundson the sub-Gaussian norms of u_(j) and v_(j), such as equations (3.19)and (3.23). Assume that g and h are measured using the proceduresdescribed previously. Then equations (3.19) and (3.23) provide bounds onthe sub-Gaussian norms of u_(j) and v_(j):

∥u _(j)∥₁₀₄ ₂ ≤δ₁ c _(jj),  (Eq. 5.18)

∥v _(j)∥₁₀₄ ₂ ≤δ₂Φ_(j)(C).  (Eq. 5.19)

Using these bounds, set λ so that it satisfies equations (5.15) and(5.16) with high probability. More precisely, set

$\begin{matrix}{{\lambda:={( {{{\epsilon_{1}^{\prime''} \cdot 4}\sqrt{mn}} + \epsilon_{2}^{\prime''}} )4\sqrt{m}( {1 + \sqrt{2}} )\sqrt{{\ln(n)}/c^{\prime}}}},} & ( {{Eq}.5.2} )\end{matrix}$ $\begin{matrix}{{{{where}\epsilon_{1}^{\prime''}}:={\frac{\delta_{1}}{1 - \epsilon_{1}^{''}}{g}_{\infty}}},{\epsilon_{1}^{''}:={2\sqrt{{\ln(n)}/c_{0}}\delta_{1}}},} & ( {{Eq}.5.21} )\end{matrix}$ $\begin{matrix}{{\epsilon_{2}^{\prime''}:={\frac{\delta_{2}}{1 - \epsilon_{2}^{''}}{h}_{\infty}}},{\epsilon_{2}^{''}:={2\sqrt{{\ln(m)}/c_{0}}{\delta_{2}.}}}} & ( {{Eq}.5.22} )\end{matrix}$

Here assume that δ₁ and δ₂ are sufficiently small (for instance,δ₁≲1/√{square root over (ln(n))} δ₂≲1/√{square root over (ln(n))} toensure that ∈″₁<½ and ∈″₂<½. Also, here c′ and c₀ are universalconstants (which are defined in the proof below).

Now let the correlation matrix C and measurement operator Φ_(D) befixed, and note that the noise term z is stochastic. With highprobability (over the random realization of z), equations (5.15) and(5.16) are be satisfied; here the failure probability is at most(e/n³)+(e/m³)+(2e/n^(1+2√{square root over (2)})).

Finally, the following simple upper bounds on ∈′″₁, ∈′″₂ and λ (thesefollow from the definitions of ∈′″₁ and ∈′″₂, the definitions of g andh, and the definition of Φ(·)):

∈′″₁≤3δ₁∥diag(C)∥₂₈,  (Eq. 5.23)

∈′″₂≤6δ₂ ∥C

₁,  (Eq. 5.24)

λ≤O(√{square root over (mln(n))}(δ₁∥diag(C)∥_(∞)√{square root over(mn)}+δ₂ ∥C

₁)).  (Eq. 5.25)

These bounds may be used to analyze the accuracy of the estimate ofmatrix C 400 (FIG. 6 ). Next, isotropy and incoherence of themeasurement operator is described. The rows of the measurement operatorΦ_(D) have two properties, isotropy and incoherence, which play afundamental role in compressed sensing. Let Q be the matrix (of size mby n(n−1)/2) that represents the action of Φ_(D) (using the fact thatthe subspace D is isomorphic to

^(n(n−1)/2)); that is, Q and Φ_(D) are related by the equation:Φ_(D)(C)=Q·uvec(C),∀C∈D. (Eq. 5.26)

The rows of Q are chosen independently at random, and each row has theform q=4uvec(rr^(T))∈

^(n(n−1)/2), (Eq. 5.27)

where r is sampled from the distribution described in (Eq. 3.2). q iscentered if it has mean

(q)=0, and q is isotropic if its covariance matrix is the identity:

(qq ^(T))=I.  (Eq. 5.28)

It is straightforward to check that q is centered and isotropic (up to anormalization

$\begin{matrix}{ {{factor}{of}2} ),{{since}:\{ {\begin{matrix}{{{\mathbb{E}}\lbrack {r_{i}r_{j}} \rbrack} = 0} & {i < j} \\{{{\mathbb{E}}\lbrack {r_{i}r_{j}r_{k}r_{l}} \rbrack} = 0} & {{i < {j{and}{}k}\  < {l{and}\ \{ {i,j} \}\cap\{ {k,l} \}}} = \varnothing} \\{{{\mathbb{E}}\lbrack {r_{i}r_{j}r_{k}r_{l}} \rbrack} = 0} & {{i < {j\ {and}\ k} < {l\ {and}\ {❘{\{ {i,j} \}\cap\{ {k,l} \}}❘}}} = 1} \\{{{\mathbb{E}}\lbrack {r_{i}r_{j}r_{k}r_{l}} \rbrack} = \frac{1}{4}} & {{i < {j\ {and}\ k} < {l\ {and}\ i}} = {{k\ {and}j}\  = l}}\end{matrix}.} }} & ( {{Eq}.5.29} )\end{matrix}$

(Note that in the last line of Eq. 5.29, a case with i=l and j=k, as therequirements of i<j and k<l leads to a contradiction.)

In addition, q is incoherent with parameter μ>0 if, with probability 1,all of its coordinates are small: ∥q∥_(∞) ²≤μ. (Eq. 5.30)

In order for this to be useful for compressed sensing, one needs μ to besmall, at most polylogarithmic in the dimension of q. For example, q isincoherent with parameter μ=16.

A non-universal recovery guarantee, using the “RIPless” framework isdescribed herein, which in turn relies on the isotropy and incoherenceproperties described above.

Let C∈

^(n×n) be a correlation matrix, and let C″∈D be its off-diagonal part(see equation (5.6)). C′ is approximately s-sparse, i.e., there exists amatrix C^((s))∈D that has at most s nonzero entries above the diagonal,and that approximates C′ in the (vector)

₁ norm, up to an error of size η_(s). This can be written compactly as:

∥uvec(C′−C ^((s)))∥₁≤η_(s),  (Eq. 5.31)

where uvec(·) was defined in equation (5.11). (Recall that both C′ andC^((s)) are symmetric, with all zeros on the diagonal. Hence it sufficesto consider those matrix entries that lie above the diagonal.)

Choose the measurement operator Φ at random (see equation (3.4)). Assumethat m (the dimension of h) satisfies the bound:

$\begin{matrix}{m \geq {{{{\overset{˜}{c}}_{0}( {1 + \beta} )} \cdot 4}s\log{\frac{n( {n - 1} )}{2}.}}} & ( {{Eq}.5.32} )\end{matrix}$

Here, {tilde over (c)}₀ is a universal constant, and β>0 is a parameterthat can be chosen freely by the experimenter. Note that m scaleslinearly with the sparsity s, but only logarithmically with thedimension n of the matrix C. This scaling is close to optimal, in aninformation-theoretic sense. This is one way of quantifying theadvantage of compressed sensing, when compared with measurements that donot exploit the sparsity of C.

Measure g≈diag(C) and h≈Φ(C), and calculate h′ (see equation (5.1)). Letδ₁ and δ₂ quantify the noise in the measurements of g and h, asdescribed in equations (5.18) and (5.19). Next, solve the LASSO problem,setting the regularization parameter λ according to (Eq. 5.20). LetW^((opt)) be the solution of this optimization problem.

Now the following recovery guarantee is noted, which shows thatW^((opt)) provides a good approximation to C′, in both the Frobenius (

₂) norm, and the vector

₁ norm. Theorem 1: For any correlation matrix C satisfying the sparsitycondition (5.31), with probability at least

$1 - \frac{12}{n( {n - 1} )} - {6e^{- \beta}}$

(over the random choice of the measurement operator Φ, with m setaccording to (Eq. 5.32)), the solution W^((opt)) is close to C′, with anerror that is bounded by:

$\begin{matrix}{{{{W^{({opt})} - C^{\prime}}}_{F} \leq {\sqrt{2}{{c( {1 + \alpha} )}\lbrack {\frac{\eta_{s}}{\sqrt{s}} + {\lambda\sqrt{s}}} \rbrack}}},} & ( {{Eq}.5.33} )\end{matrix}$ $\begin{matrix}{{{{W^{({opt})} - C^{\prime}}}_{\ell_{1}} \leq {2{{c( {1 + \alpha} )}\lbrack {\eta_{s} + {\lambda s}} \rbrack}}},} & ( {{Eq}.5.34} )\end{matrix}$

where c is a universal constant, and

${\alpha \leq {\log^{3/2}( \frac{n( {n - 1} )}{2} )}}.$

In these bounds, the first term upper-bounds the error that results fromapproximating C′ by a sparse matrix, and the second term upper-boundsthe error due to noise in the measurements of g and h.

To make the second term more transparent, the second term can becombined with the bound on λ from equation (5.25):

λ≤O(√{square root over (m ln(n))}(δ₁∥diag(C)∥_(∞)√{square root over(mn)}+δ₂ ∥C

₁)),  (Eq. 5.35)

where δ₁ and δ₂ quantify the noise in the measurements of g and h, asdescribed earlier. Also, it is useful to consider the special case whereC′ is exactly s-sparse, so η_(s)=0, and where as few measurementsettings as possible are used, by setting m∝s log n. In this case:

∥W ^((opt)) −C′∥ _(F) ≤O(log^(3/2)(n)λ√{square root over (s)})  (Eq.5.36)

≤O(√{square root over (s)} log^(3/2)(n)√{square root over (mlog(n))}·[δ₁∥diag(C)∥_(∞)√{square root over (mn)}+δ₂ ∥C

₁])  ((Eq. 5.37)

≤O(s log^(5/2)(n)[δ₁∥diag(C)∥_(∞)√{square root over (sn log(n))}+δ₂ ∥C

₁])  (Eq. 5.38)

≤O(s log^(5/2)(n)[δ₁∥diag(C)∥_(∞)√{square root over (snlog(n))}+δ₂√{square root over (n)}∥diag(C)∥₂+δ₂√{square root over(2s)}∥C′∥ _(F)]).  (Eq. 5.39)

This can be compared with the error bound (Eq. 2.31) for the naivemethod, and the RIP-based error bound (Eq. 5.48) for compressed sensing.Generally speaking, compressed sensing has an advantage over the naivemethod when s is small, and the RIPless bound is useful in the regimebetween m∝s log n and m∝s log⁴n, where the RIP-based bound does notapply. (When m is in the regime m∝s log⁴n or larger, the RIP-based boundapplies, and provides better results than the RIPless bound.)

Next, a universal recovery guarantee, using an older approach based onthe restricted isometry property (RIP) is described. This also relies onthe isotropy and incoherence properties shown above.

First, the number of measurement settings is set as: m≥c₀s log⁴n, (5.40)where s is the sparsity parameter, and co is some universal constant.(Note that m is slightly larger, by a poly(log n) factor, compared tothe RIPless case.) Also, recall that D is the subspace of symmetricmatrices whose diagonal elements are all 0 (see equation (5.10)), andΦ_(D) is the measurement operator restricted to act on this subspace.

In aspects, with high probability (over the random choice of Φ_(D)), thenormalized measurement operator Φ_(D)/√{square root over (m)} satisfiesthe RIP (for sparsity level 2s). To see this, recall the isotropy andincoherence properties shown above. These properties imply that themeasurement operator Φ_(D) is sampling at random from a “boundedorthonormal system.” Such operators are known to satisfy the RIP, via ahighly nontrivial proof.

From this point onwards, let the measurement operator Φ_(D) be fixed.Φ_(D) is capable of reconstructing the off-diagonal parts of all sparsematrices C, i.e., Φ_(D) can perform “universal” recovery.

As described above, let C∈

^(n×n) be a correlation matrix, and let C′∈D be its off-diagonal part(see equation (5.6)). Assume that C′ is approximately s-sparse, i.e.,there exists a matrix C^((s))∈D that has at most s nonzero entries abovethe diagonal, and that approximates C′ in the (vector)

₁ norm, up to an error of size η_(s). This can be written compactly as:∥uvec(C′−C^((s)))∥₁≤η_(s), (Eq. 5.41)

where uvec(·) was defined in equation (5.11). (Recall that both C′ andC^((s)) are symmetric, with all zeros on the diagonal. Hence it sufficesto consider those matrix entries that lie above the diagonal.)

Measure g≈diag(C) and h≈Φ(C), and calculate h′ (see equation (5.1)).Assume that the noise in the measurements of g and h is bounded by δ₁and δ₂, as described in equations (3.19) and (3.23). Then solve the

₁-minimization problem in (5.12) and (5.13), setting the parameters ∈₁and ∈₂ according to equations (3.21) and (3.25). Let W^((opt)) be thesolution of this problem.

The following recovery guarantee, which shows that W^((opt)) provides agood approximation to C′, in the Frobenius (

₂) norm, and in the

₁ vector norm. There is one subtle point: the convex optimizationproblem in equations (5.12) and (5.13) uses the unnormalized measurementoperator Φ_(D), while the error bounds apply to the normalizedmeasurement operator Φ_(D)/√{square root over (m)}. Hence, the noise issmaller by a factor of Vi in these error bounds.)

Theorem 2: With high probability (over the choice of the measurementoperator Φ, with m set according to (Eq. 5.40)), for all correlationmatrices C (satisfying the sparsity condition (Eq. 5.41)), the solutionW^((opt)) satisfies the following error bounds:

$\begin{matrix}{{{{W^{({opt})} - C^{\prime}}}_{F} \leq {{c_{1}\frac{\eta_{s}}{\sqrt{s}}} + {c_{2}( {\frac{\epsilon_{2}}{\sqrt{m}} + {\epsilon_{1}\sqrt{n}}} )}}},} & ( {{Eq}.5.42} )\end{matrix}$ $\begin{matrix}{{{{W^{({opt})} - C^{\prime}}}_{\ell_{1}} \leq {{c_{1}\eta_{s}} + {c_{2}\sqrt{s}( {\frac{\epsilon_{2}}{\sqrt{m}} + {\epsilon_{1}\sqrt{n}}} )}}},} & ( {{Eq}.5.43} )\end{matrix}$

where c₁ and c₂ are universal constants.

In these bounds, the first term upper-bounds the error that results fromapproximating C′ by a sparse matrix, and the second term upper-boundsthe error due to noise in the measurements of g and h.

In order to apply these bounds, one needs to know the values of ∈₁ and∈₂. These can be obtained from equations (3.22) and (3.26):

∈₁ ≤O(δ₁∥diag(C)∥₂),  (Eq. 5.44)

∈₂ ≤O(δ₂√{square root over (m)}∥C

₁).  (Eq. 5.45)

Also, it is useful to consider the special case where C′ is exactlys-sparse, so η_(s)=0, and where as few measurement settings as possibleare used, by setting m∝s log⁴n. In this case:

∥W ^((opt)) −C′∥ _(F)  (Eq. 5.46)

≤O(√{square root over (n)}δ₁∥diag(C)∥₂+δ₂ ∥C∥

₁)  (Eq. 5.47)

≤O(√{square root over (n)}δ₁∥diag(C)∥₂+δ₂(√{square root over(n)}∥diag(C)∥₂+√{square root over (2s)}∥C′∥ _(F))).  (Eq. 5.48)

This can be compared with the error bound (Eq. 2.31) for the naivemethod, and the RIPless error bound (Eq. 5.39) for compressed sensing.Generally speaking, compressed sensing has an advantage over the naivemethod when s is small, and the RIP-based bound has better scaling (as afunction of n, s, diag(C) and C′) than the RIPless bound, although itrequires m to be slightly larger.

The performance of the disclosed compressed sensing method, for atypical measurement scenario is described below. Both the accuracy ofthe method, and the experimental resources required to implement it areconsidered. The asymptotic scaling of the disclosed method is described,and compared to the naive method, direct estimation of the correlationmatrix.

Overall, the disclosed compressed sensing method has asymptoticallybetter sample complexity, whenever the off-diagonal part of thecorrelation matrix C is sufficiently sparse. In particular, for a systemof n qubits, the disclosed method is advantageous whenever the number ofcorrelated pairs of qubits, s, is at most O(n^(3/2)) (ignoring logfactors). These results are summarized in FIGS. 8A-8C.

Let C′ be the off-diagonal part of the correlation matrix C 400, thatis, C′ is the matrix whose off-diagonal elements match those of C, andwhose diagonal elements are zero. C′ has at most s nonzero elementsabove the diagonal (and, by symmetry, at most s nonzero elements belowthe diagonal). The goal is to estimate both C′ and diag(C).

Compressed sensing allows the possibility of adjusting the number ofmeasurement settings, m, over a range from m∝s log n to m∝n². (Note thatm∝s log n is just slightly above the information-theoretic lower bound,while m∝n² is the number of measurement settings used by the naivemethod.) Compressed sensing works across this whole range, but the errorbounds vary depending on m. There are two cases: (1) For m∝s log⁴n orlarger, both the RIP-based and RIPless error bounds are available, andthe RIP-based error bounds are asymptotically stronger. (2) For mbetween m∝s log n and m∝s log⁴n, only the RIPless error bound isavailable.

To make a fair comparison between compressed sensing and the naivemethod, the accuracy of these methods needs to be quantified in aconsistent way. This is a nontrivial task, because the error bounds forthe different methods have different dependencies on the parameters n,s, diag(C) and C′; this can be seen by comparing equation (2.31), andTheorems 1 and 2 as described herein.

A simple way of quantifying the accuracy of all of these methods may beused: given some δ>0, each method is required to return an estimate Ĉ′that satisfies

∥Ĉ′−C′∥ _(F) ≤δ∥C∥ _(F).  (Eq. 6.1)

The Frobenius matrix norm may be used, which is equivalent to the vector

₂ norm. C (rather than C′) is written on the right hand side of theinequality, in order to allow the recovery error to depend on both thediagonal and the off-diagonal elements of C.

In addition, each method returns an estimate g of diag(C) that satisfies

∥g−diag(C)∥₂≤δ∥diag(C)∥₂.  (Eq. 6.2)

For both compressed sensing as well as the naive method, g is obtainedin the same way, by performing single-qubit spectroscopy as in (2.19),and the error in g satisfies the same bound (Eq. 2.30).

The cost of implementing each method using real experiments should beaccounted for. This cost depends on a number of factors. One factor isthe total number of experiments that have to be performed, often calledthe sample complexity. This is the number of measurement settings, timesthe number of repetitions of the experiment with each measurementsetting. Another factor is the difficulty of performing a single run ofthe experiment. This involves both the difficulty of preparing entangledstates (random n-qubit GHZ states for the compressed sensing method, and2-qubit Bell states for the naive method), and the length of time thatone has to wait in order to observe dephasing.

An advanced quantum information processor is considered, where n-qubitGHZ states are fairly easy to prepare (using O(logn)-depth quantumcircuits), and dephasing occurs at low rates, so that the main cost ofrunning each experiment is the amount of time needed to observedephasing. In this scenario, it is reasonable to use the samplecomplexity as a rough measure of the total cost of implementing thecompressed sensing method, as well as the naive method.

In aspects, the sample complexity for three methods of interest may becalculated: (1) the naive method with m∝n², (2) compressed sensing withm∝s log⁴n, and (3) compressed sensing with m∝s log n. This method (2)outperforms the naive method whenever s≤O(n^(3/2)/log²n), and method (3)outperforms the naive method whenever s≤O(n^(2/3)/log²n).

In addition, for each of these methods, the number of samples wheresingle-qubit spectroscopy is performed, and the number of samples wheremulti-qubit spectroscopy is performed. (Recall that all of these methodsuse single-qubit spectroscopy to estimate the diagonal of C, and thenuse multi-qubit spectroscopy to estimate the off-diagonal part of C.)Both of these numbers can be important: multi-qubit spectroscopy is moreexpensive to implement on essentially all experimental platforms, andrequires more samples when s is large; but it is possible forsingle-qubit spectroscopy to dominate the overall sample complexity,when s is small.

A naive method with m∝n² is presented. Two parameters, δ₁ and δ₂, toquantify the accuracy of the measurements are used, as in equations(2.24) and (2.25). An estimate of C′ is provided, whose error is boundedby equation (2.31): with probability at least 1−η,

$\begin{matrix}{{{{\overset{\hat{}}{C}}^{\prime} - C^{\prime}}}_{F}^{2} \leq {{\frac{1}{\eta}\lbrack {{3( {n - 1} )( {\delta_{1} + \delta_{2}} )^{2}{{{diag}(C)}}_{2}^{2}} + {6\delta_{2}^{2}{C^{\prime}}_{F}^{2}}} \rbrack}.}} & ( {{Eq}.6.3} )\end{matrix}$

For simplicity, η is set to be some universal constant, for example,η=0.001. Now, given any δ>0:

$\begin{matrix}{\begin{matrix}{{{{\hat{C}}^{\prime} - C^{\prime}}}_{F}^{2} \leq {{\delta^{2}{{{diag}(C)}}_{2}^{2}} + {( {\delta^{2}/n} ){C^{\prime}}_{F}^{2}}}} \\{\leq {\delta^{2}{C}_{F}^{2}}}\end{matrix},} & ( {{Eq}.6.4} )\end{matrix}$

by setting δ₁=δ₂=O(δ/√{square root over (n)}). This satisfies therequirement (Eq. 6.1).

In addition, one can easily check that the estimate g for diag(C)satisfies the requirement (Eq. 6.2).

Then the sample complexity is as follows (see the discussion preceding(Eq. 2.24) and (Eq. 2.25)): the method performs single-qubitspectroscopy on O(n/δ₁ ²)=O(n²/δ²) samples, and multi-qubit spectroscopyon O(n²/δ₂ ²)=O(n³/δ²) samples. Hence the total sample complexity isO(n³/δ²).

In aspects, compressed sensing may be performed with m∝s log⁴n. Thesolution W^((opt)) for the

₁-minimization problem in (Eq. 5.12) and (Eq. 5.13) satisfies theRIP-based bound in Theorem 2. Two parameters, δ₁ and δ₂, are used toquantify the accuracy of the measurements, as in equations (3.19) and(3.23).

The estimator W^((opt)) satisfies the following error bound (see Theorem2, and equation (5.48)):

∥W ^((opt)) −C′∥ _(F) ≤O(√{square root over (n)}δ₁∥diag(C)∥₂+δ₂(√{squareroot over (n)}∥diag(C)∥₂+√{square root over (2s)}∥C′∥ _(F))).  (Eq. 6.5)

Now, given any δ>0:

$\begin{matrix}{\begin{matrix}{{{{W^{({opt})}} - C^{\prime}}}_{F} \leq {{\frac{1}{\sqrt{2}}\delta{{{diag}(C)}}_{2}} + {\frac{1}{\sqrt{2}}\delta{C^{\prime}}_{F}}}} \\{\leq {\delta{C}_{F}}}\end{matrix},} & ( {{Eq}.6.6} )\end{matrix}$

by setting δ₁=O(δ/√{square root over (n)}) and δ₂=O(δ/√{square root over(max(n,s))}). (Eq. 6.7)

This satisfies the requirement (Eq. 6.1). In addition, one can easilycheck that the estimate g for diag(C) satisfies the requirement (Eq.6.2). Then the sample complexity is as follows (see the discussionfollowing (Eq. 3.19) and (Eq. 3.23)): the method performs single-qubitspectroscopy on O(n/q)=O(n²/δ²) samples, and multi-qubit spectroscopy on

O(m/δ₂ ²)≤O(smax(n,s)log⁴(n)/δ²) (Eq. 6.8) samples. Hence the totalsample complexity is at most O(max(n,s)² log⁴(n)/δ²). (Eq. 6.9)

This is less than the sample complexity of the naive method, providedthe off-diagonal part of the correlation matrix is sufficiently sparse,i.e., when s≤O(n^(3/2)/log²n).

In aspects, compressed sensing may be performed with m∝s log n. TheLASSO optimization problem whose solution W^((opt)) satisfies theRIPless bound in Theorem 1. Two parameters, δ₁ and δ₂, are used toquantify the accuracy of the measurements, as in equations (3.19) and(3.23). In the following, the diagonal elements of C satisfy a bound ofthe form

$\begin{matrix}{{{{diag}(C)}}_{\infty} \leq {0{( {\frac{1}{\sqrt{n}}{{{diag}(C)}}_{2}} ).}}} & ( {{Eq}.6.1} )\end{matrix}$

This assumption may be used to get a stronger error bound for W^((opt)).The assumption (Eq. 6.10) indicates that none of the diagonal elementsc_(jj) is too much larger than the others. This is plausible for aquantum system 100 that consists of many qubits that are constructed ina similar way.

In order to make this intuition more precise, (Eq. 6.10) may be writtenin an equivalent form:

$\begin{matrix}{{{\underset{1 \leq j \leq n}{\max}( c_{jj}^{2} )} \leq {0( {\frac{1}{n}{\sum_{j = 1}^{n}c_{jj}^{2}}} )}},} & ( {{Eq}.6.11} )\end{matrix}$

which indicates that the largest c_(jj) ² is at most a constant factorlarger than the average of all of the c_(jj) ². Also, it is informativeto consider how (Eq. 6.10) and (Eq. 6.11) compare to the (arguably morenatural) assumption that

$\begin{matrix}{{\max\limits_{1 \leq j \leq n}{❘c_{jj}❘}} \leq {0{( {\frac{1}{n}{\sum_{j = 1}^{n}{❘c_{jj}❘}}} ).}}} & ( {{Eq}.6.12} )\end{matrix}$

In fact, (Eq. 6.12) is actually a stronger assumption, in the sense thatit implies (Eq. 6.10) and (Eq. 6.11), via the Cauchy-Schwarz inequality.The estimator W^((opt)) satisfies an error bound given by Theorem 1, andequation (5.39). Combining this with the assumption (Eq. 6.10):

∥W ^((opt)) −C′∥ _(F) ≤O(s log^(5/2)(n)[δ₁√{square root over (slog(n))}∥diag(C)∥₂+δ₂√{square root over (n)}∥diag(C)∥₂+δ₂√{square rootover (2 s)}∥C′∥ _(F)]).  (Eq. 6.13)

Now, given any S>0, ensure that

$\begin{matrix}{\begin{matrix}{{{W^{({opt})} - C^{\prime}}}_{F} \leq {{\frac{1}{\sqrt{2}}\delta{{{diag}(C)}}_{2}} + {\frac{1}{\sqrt{2}}\delta{C^{\prime}}_{F}}}} \\{\leq {\delta^{2}{C}_{F}^{2}}}\end{matrix},} & ( {{Eq}.6.14} )\end{matrix}$ $\begin{matrix}{{{{by}{setting}{}\delta_{1}} = {O( \frac{\delta}{s^{3/2}\log^{3}(n)} )}},} & ( {{Eq}.6.15} )\end{matrix}$ $\begin{matrix}{{{and}\delta_{2}} = {{O( \frac{\delta}{s{\log^{5/2}(n)}\sqrt{\max( {n,s} )}} )}.}} & ( {{Eq}.6.16} )\end{matrix}$

This satisfies the requirement (Eq. 6.1).

In addition, one can easily check that the estimate g for diag(C)satisfies the requirement (Eq. 6.2). Then the sample complexity is asfollows (see the discussion following (3.19) and (3.23)). The disclosedmethod performs single-qubit spectroscopy on O(n/δ₁ ²)=O(ns³ log⁶(n)/δ²)(Eq. 6.17)

In aspects, the disclosed method further includes performingsingle-qubit spectroscopy samples, and multi-qubit spectroscopy onO(m/δ₂ ²)≤O(s³ max(n,s)log⁶(n)/δ²) (Eq. 6.18) samples. Hence the totalsample complexity is at most O(s³ max(n,s)log⁶(n)/δ²). (Eq. 6.19)

The total sample complexity is less than the sample complexity of thenaive method, provided the off-diagonal part of the correlation matrixis sufficiently sparse, i.e., when s≤O(n^(2/3)/log²n)

Referring to FIGS. 9A and 9B, an evolution time t is chosen. During thephysical implementation of the measurements of the correlation matrix C,decay rates Γ_(ab) may be estimated. To do this, quantum states |ψ_(ab)

are prepared, allowing the quantum states to evolve for some time t, andthen measure the quantum states in an appropriate basis. This works wellwhen t is chosen appropriately, so that Γ_(ab)t˜1.

For example, one method of estimating the decay rate may use anevolution time t such that

$\frac{1}{2} \leq {\Gamma_{ab}t} \leq 2.$

An initial guess for t (τ₀) is made. Next, a “binary search” isperformed. The binary search includes running a sequence of experiments,where one observes the dephasing of the state |ψ_(ab)

for some time t, and after each experiment, one adjusts the time tadaptively, multiplying and dividing by factors of 2, in order to getthe “right amount” of dephasing. Generally, the sequence of experimentsis about ˜|log(Γ_(ab)τ₀)| experiments. More precisely, the followingprocedure may be performed:

1. Fix some τ₀>0; as the initial guess for the evolution time t.

2. For r=1, 2, . . . , N_(trials), perform the following: (N_(trials)are set according to equation (7.9) below)

(a) Set s₀=0 and t₀=2^(s) ⁰ τ₀ (as the initial guess for t).

(b) For j=0,1,2, . . . , N_(steps)−1, do the following: (N_(steps) areset according to equation (7.5) below)

i. Prepare the state

$  {{{ { {❘\psi_{ab}} \rangle = {\frac{1}{\sqrt{2}}( {❘a} }} \rangle +}❘}b} \rangle ),$

allow the state to dephase for time then measure in the basis

$  {{{ {\frac{1}{\sqrt{2}}( {❘a} } \rangle \pm}❘}b} \rangle )$

ii. If the measurement returns

$  {{{ {\frac{1}{\sqrt{2}}( {❘a} } \rangle +}❘}b} \rangle ),$

then set

$\begin{matrix}{s_{j + 1} = \{ \begin{matrix}{{s_{j + 1}{with}{probability}\frac{e - 1}{e + 1}},} \\{s_{j}{{otherwise}.}}\end{matrix} } & ( {{Eq}.7.1} )\end{matrix}$

If the measurement returns

$  {{{ {\frac{1}{\sqrt{2}}( {❘a} } \rangle -}❘}b} \rangle ),$

then set s_(j+1)=s_(j)−1.

iii. Set t_(j+1)=2^(s) ^(j+) τ₀. (This is the next guess for t.)

(c) Define τ_(r) to be the value of s_(j) from the last iteration of theloop, that is, ξ_(r)=s_(N) _(steps) .

3. Compute the average

$\xi = {\frac{1}{N_{trials}}{\sum_{r = 1}^{N_{trials}}{\xi_{r}.}}}$

Return {circumflex over (t)}=2⁸⁶ τ₀. (This is the estimate for t.)

This procedure can be described intuitively as follows. The inner loopof this procedure (the loop indexed by j) can be viewed as a kind ofstochastic gradient descent, which behaves like a random walk on realnumbers of the form t=2^(s)τ₀ (s∈

) (see the dashed curves in FIG. 9A).

In aspects, this random walk has a single basin of attraction at a pointt*=2^(s)*τ₀ that satisfies Γ_(ab)t*≈1, that is, s*≈−log₂(Γ_(ab)τ₀) Therandom walk converges to this point: with high probability, the sequences₀, s₁, s₂, . . . will reach the point s* afterO(|s*|)=O(|log(Γ_(ab)τ₀)|) steps; after that point, the sequence willremain concentrated around s*, with exponentially-decaying tailprobabilities (see FIG. 9B). This claim is made precise in equations(7.5) and (7.6).

Finally, the outer loop of this procedure (the loop indexed by r)computes an estimate ξ of s*, by averaging over several independenttrials (see the solid curves in FIG. 9A). This then yields an estimate{circumflex over (t)} of t*. The required number of trials, and theaccuracy of the resulting estimate {circumflex over (t)}, are analyzedin equations (7.9) and (7.10).

Referring to FIGS. 9A and 9B, plots illustrating the evolution time tare shown. FIG. 9A shows the trajectories of the random walk. Start withan initial guess τ₀, that can either be shorter or longer than theoptimal time. The evolution time is then stochastically halved ordoubled over N_(steps) iterations, according to the algorithm describedherein. This procedure is repeated (dashed curves) and the outcome isaveraged (solid curves) to obtain an estimate {circumflex over (t)}. Theregion in which

$\frac{1}{2} < {\Gamma_{ab}\hat{t}} < 2$

is shaded in green. FIG. 9B illustrates the accuracy of the finalestimate t, as a function of the number of steps N_(steps) and thestarting point τ₀. The green shading shows the region where {circumflexover (t)} satisfies the bound

$\frac{1}{2} < {\Gamma_{ab}\hat{t}} < 2.$

At the boundary of the green region, N_(steps) scales logarithmicallywith |Γ_(ab)τ₀|, as predicted by Eq. (7.5).

To choose t the random walk may be used. The variables s_(j), which arerelated to the t_(j) via the identity t_(j)=2^(s) ^(j) τ₀. It is easy tosee that s₀=0, s_(j+1)∈{s_(j),s_(j)−1,s_(j)+1}, and

$\begin{matrix}\begin{matrix}{{{\mathbb{E}}\lbrack {s_{j + 1}{❘s_{j}}} \rbrack} = {s_{j} + {\frac{1}{2}( {1 + e^{{- \Gamma_{ab}}t_{j}}} )\frac{e - 1}{e + 1}} - {\frac{1}{2}( {1 - e^{{- \Gamma_{ab}}t_{j}}} )}}} \\{= {s_{j} + {\frac{1}{e + 1}{( {e^{1 - {\Gamma_{ab}t_{j}}} - 1} ).}}}}\end{matrix} & ( {{Eq}.7.2} )\end{matrix}$

Hence the sequence s₀, s₁, s₂, . . . can be viewed as the trajectory ofa random walk on a 1-dimensional chain, beginning at s₀, with transitionprobabilities that vary along the chain. The expected behavior of therandom walk can be bounded as follows:

$\begin{matrix}{{{\mathbb{E}}\lbrack {s_{j + 1}{❘s_{j}}} \rbrack}\{ \begin{matrix}{{\geq {s_{j} + {\mu{when}\Gamma_{ab}t_{j}}} \leq \frac{1}{\sqrt{2}}},} \\{{\leq {s_{j} - {\mu{when}\Gamma_{ab}t_{j}}} \geq \sqrt{2}},} \\{{\approx {s_{j} + {\frac{e - 1}{e + 1}{when}\Gamma_{ab}t_{j}}} \ll 1},} \\{{\approx {s_{j} - {\frac{1}{e + 1}{when}{}\Gamma_{ab}t_{j}}} \gg 1},}\end{matrix} } & ( {{Eq}.7.3} )\end{matrix}$

where μ is a numerical constant,

μ=0.09.  (Eq. 7.4)

Hence the random walk will tend to converge towards some integer s* suchthat t*=2^(s)*τ₀ satisfies

$\frac{1}{\sqrt{2}} \leq {\Gamma_{ab}t^{*}} \leq {\sqrt{2}.}$

Note that the expected position of the random walk moves towards s* at arate that is lower-bounded by μ. So, in order to go from s₀=0 tos*≈−log₂(Γ_(ab)τ₀), the random walk will take roughly

${\frac{1}{\mu}{❘s^{*}❘}} = {\frac{1}{\mu}{❘{\log_{2}( {\Gamma_{ab}\tau_{0}} )}❘}}$

steps. It is easy to see that the stationary distribution of the randomwalk is centered around s*, with exponentially decaying tails; hence,once the walk reaches s*, it will remain concentrated around that point.

Set the parameter N_(steps) so that, with high probability, afterN_(steps) steps, the random walk will converge. Given an upper bound hon the magnitude of s*, i.e., |s*|≤h, or equivalently2^(−h)≤Γ_(ab)τ₀≤2^(h). The random walk may be performed for a number ofsteps

$\begin{matrix}{{N_{steps} = {\frac{h}{\mu} + \eta}},} & ( {{Eq}.7.5} )\end{matrix}$

where η≥0. Here,

$\frac{h}{\mu}$

is (an upper bound on) the expected number of steps needed to reach s*.Take an additional η steps to ensure that the walk does indeed reach s*with high probability (the probability of failure decreasesexponentially with η).

After N_(steps) steps, the final position of the walk is close to s*,with exponentially decaying tail probabilities: for any

≥1,

$\begin{matrix}{ {{{{\Pr\lbrack {{❘{s_{N_{steps}} - s^{*}}❘} \geq \ell} }❘}s_{0}} = 0} \rbrack \leq {{\frac{16}{\mu^{2}}{\exp( {{{- \frac{\mu( {\mu + 1} )}{8}}\ell} + \frac{\mu^{2}}{4}} )}} + {2{{\exp( {{- \frac{\mu}{16}}\min\{ {\frac{\mu\eta}{h},1} \}( {\ell + {\mu\eta}} )} )}.}}}} & ( {{Eq}.7.6} )\end{matrix}$

In particular, when

${\eta \geq \frac{h}{\mu}},$

this bound can be slightly simplified:

$\begin{matrix}{{\Pr\lbrack \ { {{❘{s_{N_{steps}} - s^{*}}❘} \geq \ell} \middle| s_{0}  = 0} \rbrack} \leq {{\frac{16}{\mu^{2}}{\exp( {{{- \frac{\mu( {\mu + 1} )}{8}}\ell} + \frac{\mu^{2}}{4}} )}} + {2{{\exp( {{- \frac{\mu}{16}}( {\ell + {\mu\eta}} )} )}.}}}} & ( {{Eq}.7.7} )\end{matrix}$

The bound (Eq. 7.6) is proved using martingale techniques.

In aspects, the parameter N_(trials) may be set, and an error bound onthe estimator {circumflex over (t)} may be derived. The bound (Eq. 7.6)implies that the random variables ξ_(r)=s_(N) _(steps) aresub-exponential, and their sub-exponential norms are bounded by someconstant ∥ξ_(r)∥_(ψ) ₁ ≤K. Hence their average ξ satisfies aBernstein-type concentration inequality: for every δ≥0,

$\begin{matrix}{{{\Pr\lbrack {{❘{\xi - s^{*}}❘} \geq \delta} \rbrack} \leq {2{\exp( {{- c}\min\{ {\frac{\delta^{2}}{K^{2}},\frac{\delta}{K}} \} N_{trials}} )}}},} & ( {{Eq}.7.8} )\end{matrix}$

where c>0 is a universal constant.

Now, for any ∈>0, set

$\begin{matrix}{N_{trials} = {\frac{1}{c}\max\{ {\frac{K}{\delta},\frac{K^{2}}{\delta^{2}}} \}{{\log( \frac{2}{\epsilon} )}.}}} & ( {{Eq}.7.9} )\end{matrix}$

The following error bound on the estimator {circumflex over(t)}=2^(ξ)τ₀: with probability ≥1−∈, provides |ξ−s*|<δ, which impliesthat

2^(−0.5−δ)≤Γ_(ab)√{square root over (t)}<2 ^(0.5αδ),  (Eq. 7.10)

Assuming δ<½, this implies that

${\frac{1}{2} < {\Gamma_{ab}\overset{\hat{}}{t}} < 2},$

as desired.

When the disclosed measurement protocols are implemented in anexperiment, errors may occur during state preparation and measurement(SPAM errors). The effect of these errors on estimating the decay ratesΓ_(ab) may be investigated. Let ρ₀ and E₀ denote the noiseless initialstate and observable of interest, respectively.

$\begin{matrix}{ {{ {{{ {{{ {{{ {\rho_{0} = {E_{0} = {\frac{1}{2}( {❘a} }}} \rangle\langle a }❘} + {❘b}} \rangle\langle b }❘} + {❘b}} \rangle\langle a }❘} + {❘a}} \rangle\langle b }❘} ).} & (8.1)\end{matrix}$

Error channels ε_(s) and ε_(m) that act on state preparation andmeasurement operations may be considered as:

{tilde over (ρ)}=ε_(s)(ρ₀)=ρ₀+δρ  (Eq. 8.2)

{tilde over (E)}=ε _(m)(E ₀)=E ₀ +δE,  (Eq. 8.3)

where ∥δρ∥_(tr)≤∈_(s) and ∥δE∥≤∈_(m), and ∈_(s) and ∈_(m) are smallparameters. The outcome of the protocol is now given by

{tilde over (P)} _(ab)(t)=Tr[{tilde over (E)}ε _(t)({tilde over(ρ)})],  (Eq. 8.4)

where ∈_(t)=exp(

t) is the evolution under the correlated dephasing noise (1.1).

The protocol is robust against these kinds of errors, and for shorttimes t the decay of {tilde over (P)}ab(t) is still dominated by Γ_(ab).Using Eqs. (8.2) and (8.3):

Tr[{tilde over (E)}ε _(t)({tilde over (ρ)})]=Tr[E ₀ε_(t)(ρ₀)]  (Eq. 8.5)

+Tr[E ₀ε_(t)(δρ)]+Tr[δEε _(t)(ρ₀)]

+Tr[δEε _(t)(δρ)]

The first term is the outcome without errors:

$\begin{matrix}{{{Tr}\lbrack {E_{0}{\mathcal{E}_{t}( \rho_{0} )}} \rbrack} = {\frac{1 + e^{{- t}\Gamma_{ab}}}{2}.}} & ( {{Eq}.8.6} )\end{matrix}$

Find the effect of errors on the second and third terms, by consideringthe effect of ε_(t) on ρ₁ and E₁. Specifically:

Tr[E ₀ε_(t)(δρ)]=η_(s)+ζ_(s) e ^(−Γ) ^(ab) ^(t),  (Eq. 8.7)

Tr[δEε _(t)(ρ₀)]=η_(m)+ζ_(m) e ^(−Γ) ^(ab) ,  (Eq. 8.8)

where η_(s,m) and ζ_(s,m) are constants that are determined by δρ andδE, for s and m, respectively. Therefore, these terms decay at the samerate as the first case and do not affect the exponential decay. However,the last term can, in principle, contain different decay rates and cancause deviation from a single exponential decay. Bound the rate at whichR(t)=Tr[δEε_(t)(δρ)] grows:

$\begin{matrix}{{❘{\overset{˙}{R}(t)}❘} = {❘{\frac{\partial}{\partial t}{{Tr}\lbrack {\delta E{\mathcal{E}_{c}( {\delta\rho} )}} \rbrack}}❘}\ } & ( {{Eq}.8.9} )\end{matrix}$ $\begin{matrix}{{= {{❘{{Tr}\lbrack {\delta E{\mathcal{L}({\delta\rho})}} \rbrack}❘} \leq {2\epsilon_{m}{\epsilon_{s}( {n + s} )}}}},} & ( {{Eq}.8.1} )\end{matrix}$

Therefore:

$\begin{matrix}{{{{\overset{˜}{P}}_{ab}(t)} = {{\frac{1}{2}( {1 + \eta_{s} + \eta_{m} + {( {1 + \zeta_{s} + \zeta_{m}} )e^{{- \Gamma_{ab}}t}}} )} + {R(t)}}},} & ( {{Eq}.8.12} )\end{matrix}$

The deviations from a single exponential decay are attributed to R(t).Using Eqs. (8.11) and (8.12) the decay rate of {tilde over (P)}_(ab)(t)is dominated by Γ_(ab) for evolution times t≲1/(2∈_(s)∈_(m)(n+s))⁻¹.

Referring to FIGS. 10A-10C, graphs illustrating the exponential decay of{tilde over (P)}_(ab)(t) when there are SPAM errors is shown. The decayof a 3-qubit GHZ state is simulated. A correlation matrix C is selectedwith uniform single-qubit dephasing rate (c_(ii)=γ₀) andnearest-neighbor correlations c_(i,i±1)=γ₀/4. FIGS. 10A-10C show thedecay of {tilde over (P)}_(ab)(t) under different noise strengths: (FIG.10A) Δ=0.01, (FIG. 10B) Δ=0.05, (FIG. 10C) Δ=0.1. The dashed lines showthe decay with no SPAM errors, and solid lines show the decay with SPAMerrors from randomly-sampled error channels. The solid lines resemblethe dashed lines for short evolution times t.

Numerical simulations may be used to investigate the effect of SPAMerrors on estimates of the decay rate Γ_(ab). Simulate SPAM errors byapplying random error channels ε_(s) and ε_(m), whose strengths arecontrolled by a parameter Δ. Compare the decay of {tilde over(P)}_(ab)(t) with and without SPAM errors, for different values of Δ.Observe that, for short times t, the decay with SPAM errors matches thedecay without SPAM errors, i.e., the decay rate is dominated by Γ_(ab),see FIGS. 10A-10C. This is consistent with the theoretical analysis.

The disclosed method for learning sparse correlated dephasing noise(FIGS. 3 and 4 ) can be extended to the most general case of the masterequation (1.1), where the matrix C is complex, and there is anadditional Hamiltonian term H_(s).

In aspects, the decay rates may be complex. The complete dynamicsimposed by the environment on the quantum system 100 can have a coherentevolution in addition to the decay. The evolution of the quantum system

${100\frac{d\rho}{dt}} = {\mathcal{L}(\rho)}$

is in general given by the Lindblad generator

$\begin{matrix}{{{\mathcal{L}(\rho)} = {{- {i\lbrack {\rho,H_{s}} \rbrack}} + {\sum_{l,m}{c_{lm}( {{Z_{l}\rho Z_{m}} - {\frac{1}{2}\{ {{Z_{l}Z_{m}},\rho} \}}} )}}}},} & ( {{Eq}.9.1} )\end{matrix}$ $\begin{matrix}{{{where}H_{s}} = {\sum_{l,m}{r_{lm}Z_{l}Z_{m}}}} & ( {{Eq}.9.2} )\end{matrix}$

is sometimes called the (generalized) Lamb shift Hamiltonian, andC=(c_(lm)) is now a complex matrix. Matrix C may be decomposed asC=V+iT, (Eq. 9.3)

where the real and imaginary parts are separated into V=(v_(lm)), a realsymmetric matrix, and T=(t_(lm)), a real skew-symmetric matrix,respectively. Moreover, the Lamb shift Hamiltonian can be encoded in thesymmetric matrix R=(r_(lm)).

In aspects, the operator

acts on the off-diagonal matrix elements |a

b|. This involves “decay rates” that depend on R, V and T in a simpleway, although these “decay rates” are now complex rather than real.Specifically,

(|a

b|)=(−Γ_(ab) +iΩ _(ab))|a

b|,  (Eq. 9.4)

where Γ_(ab) and Ω_(ab) are real numbers that capture the decay andoscillations of the matrix element, respectively. The convention isdefined for states |a

and |b

: Z_(i)|a

=α_(i)|a

, and similarly for |b

and β_(i). Therefore, separating the real and imaginary parts of

(|a

b|) it is found that:

$\begin{matrix}{{\Gamma_{ab} = {{- {\sum_{l,m}{v_{lm}( {{\alpha_{l}\beta_{m}} - {\frac{1}{2}\alpha_{l}\alpha_{m}} - {\frac{1}{2}\beta_{l}\beta_{m}}} )}}} = {\frac{1}{2}( {\alpha - \beta} )^{T}{V( {\alpha - \beta} )}}}},} & (9.5)\end{matrix}$

The fact that V is symmetric can be used. Similarly:

$\begin{matrix}{\Omega_{ab} = {{{- {\sum_{l,m}{r_{lm}( {{\alpha_{l}\alpha_{m}} - {\beta_{l}\beta_{m}}} )}}} + {\sum_{l,m}{v_{lm}( {{\alpha_{l}\beta_{m}} - {\frac{1}{2}\alpha_{l}\alpha_{m}} - {\frac{1}{2}\beta_{l}\beta_{m}}} )}}} = {{- ( {{\alpha^{T}R\alpha} - {\beta^{T}R\beta}} )} + {\frac{1}{2}{( {{\alpha^{T}T\beta} - {\beta^{T}T\alpha}} ).}}}}} & ( {{Eq}.9.6} )\end{matrix}$

R and T, are symmetric and skew-symmetric matrices, respectively.Therefore, the coherences, that is, the |a

b| elements of the density matrix ρ(t), exhibit both oscillations (at afrequency nab) and exponential decay (at a rate Γ_(ab)) as a function oft. Note that it is possible to measure both Γ_(ab) and Ω_(ab), byestimating the matrix elements of ρ(t) at different times t, using thesame types of Ramsey experiments described herein. In particular, onecan extract these parameters using standard techniques for analyzingspectral lines in atomic physics. Here, the squared magnitude of theFourier transform of the measurement time series is a Lorentzianfunction, the center of the Lorentzian peak provides the oscillationfrequency, and the width of the peak provides the decay rate.

Given estimates of Γ_(ab) and Ω_(ab), the compressed sensing method canbe extended to recover both the correlation matrix C 400 (FIG. 6 ) andthe Hamiltonian H_(s). As before, m˜s log n or m˜s log⁴n can be used formeasurement settings. For each measurement setting, a and b are chosenuniformly at random in {0,1}11. From estimates of Γ_(ab), V can berecovered, the real part of C, exactly as before. In a similar manner,given estimates of Ω_(ab), T may be recovered, the imaginary part of C,as well as R, the matrix that encodes the Hamiltonian H_(s). This ispossible, because the equation represents a measurement of R and T thathas the needed isotropy and incoherence properties.

In aspects, isotropy and incoherence of the Measurements Ω_(ab) may bedetermined.

Ω_(ab), viewed as a measurement of R and T, with a and b chosenuniformly at random in {0,1}^(n). This random measurement has the sameisotropy and incoherence properties as before, and hence the compressedsensing method will still succeed using these measurements. Theincoherence property is easy to see, but some work is required to showthe isotropy property.

The measurement Ω_(ab) acts on T and R as

$\begin{matrix}{\Omega_{ab} = {{{\begin{bmatrix}\alpha^{T} & \beta^{T}\end{bmatrix}\begin{bmatrix}{- R} & {\frac{1}{2}T} \\{{- \frac{1}{2}}T} & R\end{bmatrix}}\begin{bmatrix}\alpha \\\beta\end{bmatrix}}.}} & ( {{Eq}.9.7} )\end{matrix}$

It is advantageous to consider the effect of measurements if thesymmetries of R and T are enforced. Note that T and R are realskew-symmetric and symmetric matrices, respectively. Moreover, they areboth traceless. Therefore:

Ω_(ab)Σ_(i<j)(α_(i)β_(j)−β_(i)α_(j))T_(ij)+2Σ_(i<j)(α_(i)α_(j)−β_(i)β_(j))R _(ij))  (Eq. 9.8)

As before, let uvec be the linear operator that returns theupper-triangular part of a matrix (not including the diagonal elements),that is, uvec: R

(R_(ij))_(i<j). (Eq. 9.9)

Then Ω_(ab) can be expressed as

$\begin{matrix}{{\Omega_{ab} = {q\begin{bmatrix}{uve{c(T)}} \\{2{uve}{c(R)}}\end{bmatrix}}},} & ( {{Eq}.9.1} )\end{matrix}$

similar to Eq. (5.26), where q is a row vector of the form

q=[uvec(αβ^(T)−βα^(T)),uvec(αα^(T)−ββ^(T))].  (Eq. 9.11)

Note that as described above, α and β are chosen uniformly andindependently at random from {1, −1}^(n). It is then straightforward tosee that q in this case satisfies the incoherence property (Eq. 5.30)with μ=1. Furthermore, one can check that q is centered and isotropic(up to a normalization factor of √{square root over (2)}), since:

$\begin{matrix}\{ {\begin{matrix}{{{\mathbb{E}}\lbrack {\alpha_{i}\alpha_{j}} \rbrack} = {{{\mathbb{E}}\lbrack {\alpha_{i}\beta_{j}} \rbrack} = {{{\mathbb{E}}\lbrack {\beta_{i}\beta_{j}} \rbrack} = 0}}} \\{{{\mathbb{E}}\lbrack {( {{\alpha_{i}\beta_{j}} - {\beta_{i}\alpha_{j}}} )( {{\alpha_{k}\beta_{l}} - {\beta_{k}\alpha_{l}}} )} \rbrack} = {2\delta_{ik}\delta_{jl}}} \\{{{\mathbb{E}}\lbrack {( {{\alpha_{i}\alpha_{j}} - {\beta_{i}\beta_{j}}} )( {{\alpha_{k}\alpha_{l}} - {\beta_{k}\beta_{l}}} )} \rbrack} = {2\delta_{ik}\delta_{jl}}} \\{{{\mathbb{E}}\lbrack {( {{\alpha_{i}\beta_{j}} - {\beta_{i}\alpha_{j}}} )( {{\alpha_{k}\alpha_{l}} - {\beta_{k}\beta_{l}}} )} \rbrack} = 0}\end{matrix},}  & (9.12)\end{matrix}$

where it is assumed that i<j and k<l in all cases. The second and thethird lines in the above equation capture the correlations in themeasurements of T and R, respectively, and the last line captures thecross-correlations between the two measurements.

FIG. 11 illustrates a flow diagram of the computer-implemented method1100 for determining a two-qubit correlated dephasing error for thequantum system 100, such as the quantum system of FIG. 1 . It iscontemplated that the performance of the order of the steps of FIG. 11may be changed. The steps of method 1100 may be performed by aprocessor.

Initially, at step 1102, a signal of a quantum system 100 is accessed.The signal includes a plurality of qubits 102 (FIG. 1 ). Every qubit hasa nonzero dephasing rate. The signal includes information regarding amatrix (e.g., matrix C 400). The matrix includes diagonal elements andoff-diagonal elements (FIG. 6 ). The off-diagonal elements of the matrixare 2s-sparse.

Next, at step 1104, randomized measurements of the off-diagonal elementsof the matrix are performed. The randomized measurements may includepreparing entangled GHZ states on random subsets of the plurality ofqubits 102. The randomized measurements may be based on noisespectroscopy and quantum sensing.

Next, at step 1106, the matrix is recovered based on a directmeasurement of the diagonal elements of the matrix. The recovered matrixmay include a restricted isometry property.

In aspects, a two-qubit correlated dephasing error may be estimatedbased on the recovered matrix. In another aspect, a long-rangecorrelated dephasing error may be detected based on the recoveredmatrix.

In aspects, a decay rate may be collected in a vector. The recoveredmatrix may be further based on the estimated decay rate. A relaxationtime and/or decoherence time of the quantum system 100 may be estimatedbased on the estimated decay rate.

Certain embodiments of the present disclosure may include some, all, ornone of the above advantages and/or one or more other advantages readilyapparent to those skilled in the art from the drawings, descriptions,and claims included herein. Moreover, while specific advantages havebeen enumerated above, the various embodiments of the present disclosuremay include all, some, or none of the enumerated advantages and/or otheradvantages not specifically enumerated above.

The embodiments disclosed herein are examples of the disclosure and maybe embodied in various forms. For instance, although certain embodimentsherein are described as separate embodiments, each of the embodimentsherein may be combined with one or more of the other embodiments herein.Specific structural and functional details disclosed herein are not tobe interpreted as limiting, but as a basis for the claims and as arepresentative basis for teaching one skilled in the art to variouslyemploy the present disclosure in virtually any appropriately detailedstructure. Like reference numerals may refer to similar or identicalelements throughout the description of the figures.

The phrases “in an embodiment,” “in embodiments,” “in variousembodiments,” “in some embodiments,” or “in other embodiments” may eachrefer to one or more of the same or different example embodimentsprovided in the present disclosure. A phrase in the form “A or B” means“(A), (B), or (A and B).” A phrase in the form “at least one of A, B, orC” means “(A); (B); (C); (A and B); (A and C); (B and C); or (A, B, andC).”

It should be understood that the foregoing description is onlyillustrative of the present disclosure. Various alternatives andmodifications can be devised by those skilled in the art withoutdeparting from the disclosure. Accordingly, the present disclosure isintended to embrace all such alternatives, modifications, and variances.The embodiments described with reference to the attached drawing figuresare presented only to demonstrate certain examples of the disclosure.Other elements, steps, methods, and techniques that are insubstantiallydifferent from those described above and/or in the appended claims arealso intended to be within the scope of the disclosure.

What is claimed is:
 1. A computer-implemented method for detecting atwo-qubit correlated dephasing error, the method comprising: accessing asignal of a quantum system, the quantum system includes a plurality ofqubits, wherein every qubit has a nonzero rate of dephasing, whereinsome qubits have a nonzero rate of correlated dephasing, wherein thesignal includes information about a matrix, the matrix includingdiagonal elements and off-diagonal elements, and wherein theoff-diagonal elements of the matrix are 2s-sparse; performing randomizedmeasurements of the off-diagonal elements of the matrix; and recoveringthe matrix based on a direct measurement of the diagonal elements of thematrix.
 2. The computer-implemented method of claim 1, furthercomprising estimating a two-qubit correlated dephasing error based onthe recovered matrix.
 3. The computer-implemented method of claim 1,wherein performing the randomized measurements includes preparingentangled Greenberger-Horne-Zeilinger states (GHZ states) on randomsubsets of the plurality of qubits.
 4. The computer-implemented methodof claim 1, wherein the randomized measurements are based on noisespectroscopy and quantum sensing.
 5. The computer-implemented method ofclaim 1, wherein the recovered matrix includes a restricted isometryproperty.
 6. The computer-implemented method of claim 1, furthercomprising estimating a vector of decay rates that are dependent on thematrix.
 7. The computer-implemented method of claim 6, wherein therecovered matrix is further based on the estimated decay rate.
 8. Thecomputer-implemented method of claim 6, further comprising estimating arelaxation time of the quantum system based on the estimated vector ofdecay rates.
 9. The computer-implemented method of claim 6, furthercomprising estimating a decoherence time of the quantum system based onthe estimated vector of decay rates.
 10. The computer-implemented methodof claim 1, further comprising detecting a long-range correlateddephasing error based on the recovered matrix.
 11. Acomputer-implemented method for detecting a two-qubit correlateddephasing error, the method comprising: accessing a signal of a quantumsystem, the quantum system includes a plurality of qubits, wherein everyqubit of the plurality of qubits has a nonzero rate of dephasing andwherein some qubits have a nonzero rate of correlated dephasing;dephasing entangled states of the plurality of qubits based onperforming Ramsey spectroscopy using entangled states of random subsetsof qubits of the plurality of qubits; measuring a linear function of acorrelation matrix, where the correlation matrix corresponds tocorrelated Markovian dephasing between pairs of qubits of the pluralityof qubits; generating a first vector and a second vector, wherein aplurality of elements of the first vector and of the second vector arerandomly chosen; and estimating a decay rate based on the first vectorand the second vector.
 12. The computer-implemented method of claim 11,further comprising generating a restricted isometry property-basedrecovery matrix, wherein the recovery matrix is further based on theestimated decay rate.
 13. The computer-implemented method of claim 12,further comprising detecting a long-range correlated dephasing errorbased on the recovery matrix.
 14. The computer-implemented method ofclaim 11, further comprising estimating a relaxation time of the quantumsystem based on the estimated decay rate.
 15. The computer-implementedmethod of claim 11, further comprising estimating a decoherence time ofthe quantum system based on the estimated decay rate.
 16. A system fordetecting a two-qubit correlated dephasing error, the system comprising:a processor; and a memory, including instructions stored thereon, whichwhen executed by the processor, cause the system to: access a signal ofa quantum system, the signal includes a plurality of qubits, whereinevery qubit has a nonzero dephasing rate; and generate a matrix C inR^(n×n), wherein its off-diagonal part is 2s sparse and wherein thematrix C is based on the accessed signal.
 17. The system of claim 16,wherein the instructions, when executed by the processor, further causethe system to: perform randomized measurements based on: estimating(b−a)^(T)C(b−a) for any a≠b in {0,1}n; choosing a sequence of randomvectors r(1), . . . , r(m)in {1,0,−1}n; and estimatingΦC=(r^((1){circumflex over ( )}T)Cr⁽¹⁾), . . . ,(r^((m){circumflex over ( )}T)Cr^((m)), where Φ:R^(n×n)→R^(m).
 18. Thesystem of claim 17, wherein the instructions, when executed by theprocessor, further cause the system to: estimate a decay rate Γ_(ab)based on any a≠b in {0,1}^(n); and estimate a decoherence time of thequantum system based on the decay rate Γ_(ab).
 19. The system of claim18, wherein the instructions, when executed by the processor, furthercause the system to: recover Matrix C based on: measuring a plurality ofdiagonal elements of matrix C directly; and determining an off-diagonalelement of matrix C based on using

₁-regularized least-squares regression.
 20. The system of claim 19,wherein the instructions, when executed by the processor, further causethe system to: detect a long-range correlated dephasing error based onthe recovered matrix C.